I.  SUMMARY  OF  RESEARCH  OBJECTIVES 


A.  The  Basic  Problem 

The basic problem considered in this research is the inverse potential problem of
reconstructing the scattering potential of a multidimensional Schrödinger equation from
impulse reflection response data of several different types. This problem has direct
applications in: (1) exploration seismology, in which the potential represents the
inhomogeneity of a scattering medium which is to be reconstructed; (2) nondestructive
testing, which can be formulated similarly to exploration seismology, save for the
availability of more scattering data; (3) inverse resistivity reconstruction, in which the
potential is related to the electrical resistivity of the medium and the excitation is DC
injected current; and (4) linear least-squares estimation of random fields. More details on
these problems, including their mathematical formulation, can be found in Appendix A and
the references cited therein.

The particular formulation addressed is the inverse scattering problem of reconstructing
the scattering potential of a two-dimensional scattering medium from its backscattered
reflection response to a planar impulsive wave normally incident on the medium from above.
This formulation is directly related to the applications noted above, since the planar
impulsive wave can be generated by explosive sources on the surface of the medium (e.g.,
the earth’s surface), and the response measured by geophones there. However, this
formulation is also applicable to inverse problems that at first glance may seem totally
unrelated to these. For example, the problem of reconstructing a scattering medium from its
sinusoidal steady-state response to a set of point harmonic sources (i.e., single-frequency
oscillators) can be formulated as this problem, even though the source and measurements
are completely different (see Appendix A).

B.  Approach  to  Solving  Problem 

There are integral equation based methods for solving the multidimensional Schrödinger
equation inverse potential problem. These methods include the generalized Marchenko and
Gel’fand-Levitan integral equations of Newton. However, these methods are numerically
untested, and would require a huge amount of computation.

This project pursued an alternative approach: layer stripping. Layer stripping
algorithms recursively and differentially reconstruct the scattering potential, slice by slice,
and simultaneously propagate the scattered wave field deeper into the scattering medium.
They require much less computation than integral equation based methods, since they
exploit time causality and the Hankel structure in the integral equations. They are recursive
in one spatial dimension, but parallelizable in time and all other spatial dimensions.

Layer stripping algorithms mathematically mimic the actual physical scattering
process. They operate by recursively reconstructing the scattering potential at the wave front
of the incident wave field, and then using it to mathematically propagate the wave field
further. This operation is similar to the method of characteristics for partial differential
equations, with the added feature of determining the coefficients from the jump in the
scattered field at the wave front. The algorithms are also related to the Levinson
algorithm of linear least-squares estimation theory; they exploit the Hankel structure in the
integral equations, just as the Levinson algorithm exploits the Toeplitz structure in the
Yule-Walker equations of linear prediction.
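
The recursion described above can be illustrated in its simplest setting. The following is a minimal sketch, assuming a one-dimensional lossless stack of equal-travel-time layers with a matched (non-free) surface; it is a Schur-type recursion, not the two-dimensional algorithm of Appendix A. The function names (series_div, forward_reflection, layer_strip) are hypothetical. Each step reads one reflection coefficient off the first sample of the response (the jump at the wave front) and then strips that interface to obtain the response of the medium below it.

```python
import numpy as np


def series_div(num, den):
    """Return the first len(num) coefficients of the power series num(z)/den(z)."""
    out = np.zeros(len(num))
    for i in range(len(num)):
        # Causal recursion: out[i] depends only on samples 0..i of num and den.
        out[i] = (num[i] - np.dot(den[1:i + 1], out[:i][::-1])) / den[0]
    return out


def forward_reflection(r, n_samples):
    """Forward problem: impulse reflection response of interfaces r[0], r[1], ..."""
    R = np.zeros(n_samples)
    for rk in reversed(r):                      # build the stack from the bottom up
        zR = np.concatenate(([0.0], R[:-1]))    # one two-way travel-time delay
        num = zR.copy()
        num[0] += rk                            # rk + z*R
        den = rk * zR
        den[0] += 1.0                           # 1 + rk*z*R
        R = series_div(num, den)
    return R


def layer_strip(R, n_layers):
    """Inverse problem: recover reflection coefficients one interface at a time."""
    R = np.array(R, dtype=float)
    recovered = []
    for _ in range(n_layers):
        rk = R[0]                               # jump at the wave front
        recovered.append(rk)
        num = R.copy()
        num[0] -= rk                            # R - rk
        den = -rk * R
        den[0] += 1.0                           # 1 - rk*R
        zR = series_div(num, den)               # equals z * (response below)
        R = np.concatenate((zR[1:], [0.0]))     # remove the delay
    return np.array(recovered)


r_true = [0.3, -0.2, 0.5, 0.1]
response = forward_reflection(r_true, 8)
r_est = layer_strip(response, len(r_true))
```

In this sketch the second response sample equals r_true[1] * (1 - r_true[0]**2), so reading reflection coefficients directly from the data (a Born-type approximation) is already biased at the second interface, whereas the stripping recursion removes the effect of the interfaces above.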

C.  Original  Research  Objectives 

The main goal of this project was to develop fast algorithms for solving the
multidimensional inverse scattering problem of reconstructing the potential of a Schrödinger
equation from various types of scattering data. To this end, several specific tasks are now
defined:

1. To mathematically develop the layer stripping idea (heretofore only defined
differentially) into a specific numerical algorithm, and to test it with respect to the following
items:

a.  Success  in  reconstructing  the  scattering  potential; 

b.  Improvement  over  (simpler)  Born  approximation  reconstructions; 

c.  Robustness  of  results  to  small  amounts  of  additive  noise  in  the  data; 

d.  Robustness  of  results  with  respect  to  variations  in  discretization  grid  size; 


2.  To  develop  algorithms  for  generating  the  scattering  data  to  which  the  layer  stripping 
algorithm  can  be  applied,  i.e.,  solving  the  forward  problem  of  computing  scattering 
data  from  a  known  scattering  potential; 

3. To develop other iterative algorithms which might be applicable to solving the Marchenko
and Gel’fand-Levitan integral equations noted above. This is necessary when the
scattering data consists of a far-field scattering amplitude.

D.  Additional  Research  Objectives 

In  the  course  of  pursuing  the  research  required  to  achieve  the  above  goals,  several  other 
issues  arose.  In  some  cases  these  issues,  which  were  not  part  of  our  original  proposal,  turned 
out  to  be  more  significant  than  the  goals  specified  above.  In  other  cases  serendipitous 
results were obtained that are of considerable interest in their own right. These included
the following:

1.  To  investigate  the  admissibility  of  scattering  data  for  inverse  scattering  problems.  We 
have  discovered  that  this  is  a  crucial  issue  in  applying  layer  stripping  algorithms,  and 
likely  a  highly  significant  issue  in  applying  other  types  of  algorithms  as  well.  This 
work  is  being  developed  further  in  our  renewal  grant; 

2. To investigate how to apply lateral regularization in the layer stripping algorithm.
Downward continuation methods for extrapolating wave fields defined on planes through
a medium are known to be very ill-conditioned, since the downward continuation
problem itself is ill-conditioned. Clearly some sort of regularization is needed to ensure a
stable algorithm. The questions are how this should be done and, more importantly, what
regularization is implicitly being applied when the algorithm is discretized, as it must be;

3. To investigate how the wavelet transform could be used to implement layer stripping
ideas. The idea here is to use the wavelet transform as an orthonormal basis for
representing continuous functions of time and space. We have shown that in the wavelet
transform representation wave propagation in a layered one-dimensional medium can
be represented by a set of coupled discrete wave systems whose wave speeds are powers
of two. This suggests that discrete layer stripping ideas can be applied to continuous
problems in ways other than by direct discretization. Our preliminary results here
led to an AASERT grant under which this is being investigated further;

4.  A  major  concern  of  ONR  is  reconstruction  of  dielectric  media  from  their  reflection 
response.  Since  dielectric  media  are  usually  absorbing  media  (waves  propagating 
through  them  are  attenuated  due  to  energy  absorption),  loss  must  be  included  in  the 
forward  and  inverse  scattering  problem  formulations; 

5. To investigate algorithms which employ the above ideas and apply them to closely
related inverse scattering problems, including diffraction tomography, X-ray tomography,
and positron-emission tomography.


II.  SUMMARY  OF  RESEARCH  ACCOMPLISHMENTS 


A.  Original  Research  Objectives 

Our  accomplishments  in  answering  the  original  goals  can  be  summarized  as  follows 
(the  numbering  matches  that  of  the  goals  in  Section  I): 

1. We have successfully developed a layer stripping algorithm for the 2-D inverse
scattering problem (task #1), and tested it with respect to the factors listed under task
#1. The algorithm works quite well, and successfully reconstructs features of the
scattering potential that the Born approximation algorithm is unable to reconstruct.
Details are given in Appendix A;

2.  We  have  successfully  implemented  an  invariant  imbedding  algorithm  for  computing 
forward  problem  data  sets  for  the  2-D  inverse  scattering  problem  (task  #2).  This 
algorithm  is  computationally  intensive,  since  it  does  not  use  the  layer  stripping  idea 
(using the same concept to generate forward data and then solve the inverse problem
would raise the issue of whether one algorithm was simply running the other
backwards). The algorithm is reviewed in Appendix A. The computation time required
sparked  our  interest  in  iterative  algorithms  for  solving  the  integral  equations  arising 
in  the  forward  problem; 

3.  We  have  developed  the  generalized  Landweber  iteration  into  a  useful  algorithm  for 
solving  large  systems  of  equations.  We  have  studied  the  numerical  behavior  of  this 
algorithm  in  detail,  discovered  some  important,  new,  useful  properties,  developed  a 
new  convergence  acceleration  procedure,  and  shown  how  to  control  the  filtering  and 
convergence  properties  of  this  algorithm.  Results  are  in  Appendices  F-H.  Although 
the  specific  application  considered  there  is  the  inverse  problem  of  positron  emission 
tomography,  the  algorithm  should  also  prove  useful  in  solving  the  large  systems  of 
equations arising from discretized forms of the integral equations of forward and
inverse scattering (task #3).


B.  Additional  Research  Objectives 


Our accomplishments in answering the additional tasks can be summarized as follows
(the numbering matches that of the goals in Section I):

1.  The  data  admissibility  issue  is  discussed  in  Subsection  IIC  below. 

2.  Downward  continuation,  which  all  2-D  layer  stripping  algorithms  utilize  in  one  way  or 
another,  is  ill-conditioned,  meaning  that  a  small  perturbation  of  the  scattering  data 
results  in  a  huge  change  in  the  extrapolated  wave  field  at  depth.  Hence  some  sort 
of  regularization  is  required.  By  regularization,  we  mean  that  the  problem  must  be 
altered  slightly  to  remove  the  ill-conditioning,  but  the  solution  to  the  altered  problem 
must  retain  the  essential  characteristics  of  the  solution  to  the  original  problem. 

This issue was raised in our paper (Appendix A). We now note, in Appendix B, some
justifications for the lateral smoothing regularization used in the algorithm of
Appendix A. In particular, we note that an explicitly discrete formulation of the 2-D
inverse scattering problem leads to the lateral smoothing regularization used by the
discretized algorithm of Appendix A, so that the regularization is consistent with the
discrete nature of the algorithm.

3. We have applied the wavelet transform, currently a very active research area in signal
processing, to the 1-D inverse scattering problem. We have obtained three algorithms
for this problem. One is a layer stripping algorithm that operates in the 1-D
time-wavelet domain, one is a layer stripping algorithm that operates in the 2-D space-time
wavelet domain, and the third is a linear system of equations that comes from the Krein
integral equation in the 2-D space-time wavelet domain. Results are in Appendix C;

4. We have developed layer stripping algorithms for 1-D absorbing media and applied
these algorithms to problems in reconstruction of absorbing dielectric media from their
reflection responses. We have also applied them to reconstruction of lossy transmission
lines and to the synthesis of dielectric waveguides. In fact, a complete theory
for 1-D forward and inverse scattering problems for discrete layered lossy media has
been developed. By a “complete theory” we mean systems of equations that are
discrete counterparts to integral equations, and discrete fast algorithms that solve these
systems of equations. Results are presented in Appendix E1.

In addition, we have solved the problem of using plane wave reflection response at two
angles of incidence, rather than both reflection and transmission data (the latter would
not be available in remote sensing applications). This leads to a novel semi-iterative
use of layer stripping. Numerical examples on reconstructing a glacial ice shelf from
radar reflections demonstrate the significance of modelling multiple reflections and
losses. Results are presented in Appendix E2.

5.  Multiresolution  methods  for  the  inverse  Radon  transform  are  discussed  in  Subsection 
IID  below. 
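
The ill-conditioning of downward continuation noted in item 2 above can be made concrete with a small sketch. It assumes a constant-background Helmholtz model (an assumption made here for illustration, not a formulation taken from this report), in which a lateral Fourier component with wavenumber kx larger than the background wavenumber k0 is evanescent and is amplified exponentially when continued downward. The crude lateral low-pass filter shown is only a stand-in for a lateral smoothing regularization; the names continuation_gain and lateral_lowpass are hypothetical.

```python
import numpy as np


def continuation_gain(kx, k0, depth):
    """Amplification of each lateral wavenumber when continued down by `depth`."""
    kx = np.asarray(kx, dtype=float)
    # Evanescent components (|kx| > k0) grow like exp(sqrt(kx^2 - k0^2) * depth);
    # propagating components (|kx| <= k0) keep unit magnitude.
    growth_rate = np.sqrt(np.maximum(kx ** 2 - k0 ** 2, 0.0))
    return np.exp(growth_rate * depth)


def lateral_lowpass(field, dx, k0):
    """Remove lateral wavenumbers above k0 (a crude lateral smoothing)."""
    spectrum = np.fft.fft(field)
    kx = 2.0 * np.pi * np.fft.fftfreq(len(field), d=dx)
    spectrum[np.abs(kx) > k0] = 0.0
    return np.fft.ifft(spectrum).real


gains = continuation_gain([0.0, 5.0], k0=3.0, depth=2.0)
noisy = 1.0 + (-1.0) ** np.arange(16)      # smooth part plus grid-scale noise
smoothed = lateral_lowpass(noisy, dx=1.0, k0=1.0)
```

Under these assumptions a component at kx = 5 with k0 = 3 is amplified by exp(8) over a depth of 2, which is why small grid-scale perturbations of the data can dominate the extrapolated field unless they are suppressed laterally.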

C.  Data  Admissibility 

There  has  been  considerable  resistance  to  the  idea  of  layer  stripping  in  general,  due 
to  the  belief  that  it  is  inherently  numerically  unstable  in  the  presence  of  noise.  This 
stems  from  the  reputation,  dating  back  to  the  1950s,  that  what  was  then  called  “dynamic 
deconvolution”  was  numerically  unstable  in  noise,  as  confirmed  by  numerical  simulations. 

However,  the  Schur  algorithm,  which  is  the  most  basic  form  of  layer  stripping,  is 
known  to  be  numerically  stable.  Furthermore,  our  basic  multidimensional  layer  stripping 
algorithm  has  been  shown  to  be  numerically  stable  in  small  amounts  of  additive  noise  (see 
Appendix  A).  This  seems  to  contradict  the  conventional  wisdom  noted  above. 

This  contradiction  can  be  resolved  by  noting  that  a  major  concept  that  has  received 
insufficient  attention  is  the  admissibility  of  the  scattering  data  from  which  the  scattering 
potential  is  to  be  reconstructed.  A  data  set  is  admissible  if  there  exists  a  scattering  potential 
which  would  generate  this  (noisy)  data  set,  in  the  absence  of  noise.  An  inadmissible  data 
set  could  not  possibly  have  arisen  from  any  scattering  potential,  and  so  must  be  “wrong.”  A 
noisy  but  admissible  data  set  will  result  in  reconstruction  of  a  “noisy”  scattering  potential, 
but  at  least  there  is  a  potential  which  can  be  associated  with  the  noisy  data. 

The  significance  of  admissibility  of  data  is  as  follows.  First,  recall  that  layer  stripping 
algorithms  can  be  viewed  as  fast  algorithm  solutions  of  integral  equations  with  Toeplitz 
or  Hankel  structure.  This  has  been  well  established  in  the  1-D  case  (see  Bruckstein,  Levy, 
and  Kailath,  SIAM  J.  Applied  Math.  45,  312-335  (1985))  and  extended  to  the  3-D  case 
in a series of papers by the Principal Investigator. Second, recall that these algorithms
reconstruct the potential recursively, so that they solve a series of subproblems on the way
to  solving  the  final  problem.  This  requires  not  only  that  the  integral  equation  kernel  be 
positive  definite,  but  that  the  kernels  associated  with  the  subproblems  also  be  positive 
definite.  Finally,  it  can  be  shown  that  these  conditions  are  necessarily  true  for  correct 
scattering  data. 
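
As a hedged illustration of this requirement, the sketch below assumes that the discretized kernel is a symmetric Toeplitz matrix and that the subproblems correspond to its nested leading blocks, as in the Levinson recursion. This is an assumption made for illustration; the actual feasibility condition is the one derived in Appendix B. The function name first_failing_block is hypothetical. A Cholesky factorization succeeds only for a positive definite matrix, so the size of the first block for which it fails marks the subproblem at which a recursive algorithm would break down.

```python
import numpy as np


def first_failing_block(first_column):
    """Return the size of the first leading block that is not positive definite.

    `first_column` defines a symmetric Toeplitz matrix. Returns None when every
    nested leading block is positive definite.
    """
    c = np.asarray(first_column, dtype=float)
    n = len(c)
    index = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    kernel = c[index]                      # kernel[i, j] = c[|i - j|]
    for size in range(1, n + 1):
        try:
            np.linalg.cholesky(kernel[:size, :size])
        except np.linalg.LinAlgError:
            return size
    return None


consistent = first_failing_block([1.0, 0.5, 0.25])
inconsistent = first_failing_block([1.0, 0.9, 0.5])
```

In this toy example the second sequence passes the 2 x 2 subproblem but fails at the 3 x 3 one, so a recursion that solves the subproblems in order would run normally for two steps and then break down.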

Hence, if these conditions are not fulfilled, some of the subproblems to be solved by
the layer stripping algorithm will not be solvable, and the algorithm will fail by becoming
numerically unstable. This is a classic case of “garbage in, garbage out”: if the data set is
infeasible, it is nonsensical, and the “fault” of the layer stripping algorithm is that it cannot
make sense out of nonsense. The algorithm should fail, and does. Indeed, any algorithm
that succeeds on this data set must either: (1) make some approximation under which
the data set becomes admissible (e.g., the Born approximation); or (2) somehow render
the inadmissible data admissible.

This  raises  the  issue  of  how  to  make  an  inadmissible  data  set  admissible,  i.e.,  how 
should  noisy  scattering  data  be  altered  so  that  it  is  admissible  (although  still  noisy)?  If 
this  can  be  done  easily,  layer  stripping  algorithms  applied  to  the  admissible  data  will  be 
numerically  stable.  The  reconstructed  potential  will  still  be  noisy,  but  at  least  it  can  be 
reconstructed. 

We make a start at answering these issues in Appendix B. Specifically, we derive a
feasibility condition on 1-D and 2-D free-surface reflection responses from scattering media.
We present numerical examples that demonstrate how:

1.  The  Born  approximation  is  insufficient  to  reconstruct  strongly  scattering  media,  since 
it  neglects  multiple  scattering; 

2. Noise added to the data can make layer stripping algorithms diverge (“blow up”);

3. Rendering the noisy data admissible makes the layer stripping algorithm stable. The
reconstruction is still noisy, but this is because the data are also still noisy. However,
noisy data need not make the algorithm diverge, if the noisy data are still admissible.

D.  Multiresolution  Methods  for  Inverting  the  Radon  Transform 

We have not totally forsaken the Born approximation under this grant. The 2-D
inverse scattering problem with the Born approximation becomes the problem of
reconstruction from projections, i.e., inverting the Radon transform. This problem has many
applications in medical imaging (where it is the problem of x-ray tomography) and
nondestructive evaluation (NDE).

This project has applied time-frequency methods, especially the wavelet transform, to
the following three problems in image reconstruction from projections:

1.  We  have  derived  a  fast  image-domain  filter  which  solves  the  following  constrained 
inverse  Radon  transform  problem:  Given  constraints  on  certain  wavelet  coefficients 
of  the  image,  compute  from  its  projections  the  image  which  either:  (a)  requires  the 
smallest  perturbation  of  the  projection  data  to  satisfy  these  constraints;  or  (b)  is 
the  constrained  linear  least-squares  image  estimate.  The  wavelet  transform  can  be 
used  for  spatially-varying  filtering  of  an  image,  suppressing  noise  locally  in  smooth 
regions;  we  also  discuss  detection  of  such  regions  in  a  noisy  image,  which  leads  to 
the  wavelet  coefficient  constraints.  Numerical  results  show  improvement  over  filtered 
images,  since  the  constraints  improve  the  reconstruction  in  non-constrained  areas  as 
well. These results are presented in Appendices J1, J2, and O.

2. We have shown that when the extent of missing angles is small in limited-angle
tomography, two of the three sets of detail images in the wavelet transform are
unaffected, and low-resolution images can be obtained by interpolation. Using some a
priori partial information on edges parallel to the missing angles, we have developed
a wavelet-domain algorithm for restoring the image. These results are presented in
Appendix M.

3.  We  have  shown  that  the  local  tomography  problem  of  reconstructing  only  a  small 
region  of  interest  (ROI)  from  a  limited  set  of  projections  can  be  solved  by  sampling 
the  projections  at  a  rate  that  decreases  exponentially  with  distance  from  the  ROI. 
This  reconstructs  the  ROI  with  high  resolution,  and  the  remainder  of  the  image  at 
lower resolution. The algorithm is also much faster than conventional filtered
backprojection. These results are presented in Appendix N.

4. We have developed a layer-stripping-type algorithm for the causal formulation of the
problem of reconstructing a function from certain values of its spherical means. We
have also shown that this problem has an important new application in diffraction
tomography. Results are in Appendix D.

5. We have developed a parallel implementation of the 1-D invariant imbedding
algorithms. Error analysis of this algorithm shows that the error is the same order of
magnitude as the discretization error. However, the resulting algorithm is highly
parallelizable. Results are presented in Appendix T.
IV.  SUMMARIES  OF  APPENDICES 


Technical  details  of  our  research  accomplishments  are  provided  in  the  Appendices. 
Each  of  these  is  a  published  journal  article  or  conference  paper  or  a  submitted  journal 
article.  They  include  details  of  the  problem  formulations,  the  results,  and  their  applications 
and significance. We now briefly summarize their contents. All of the results presented
below  are  new,  unless  otherwise  indicated. 

APPENDIX  A 

A.E. Yagle and P. Raadhakrishnan, “Numerical Performance of Layer
Stripping Algorithms for Two-Dimensional Inverse Scattering Problems,”
Inverse Problems 8(4), 645-665, August 1992.

This paper summarizes our results to date on the numerical performance of the 2-D
layer stripping algorithm with respect to the factors listed in subsection IC-2. It also
includes a derivation of the 2-D invariant imbedding algorithm used to produce the forward
problem scattering data; this algorithm is based on a previous algorithm (see references).
Since the forward problem and inverse problem algorithms operate differently, this
constitutes a much better test of the inversion algorithm. A problem with the forward
algorithm is the enormous amount of computation required to generate the forward problem
data; our inverse problem algorithm is much faster, even though the inverse problem is
mathematically more difficult. The Born approximation to the layer stripping algorithm is
also derived and discussed.

The reconstructed potentials agree closely with the original potentials, even for coarse
(16 x 16) discretization grids; the agreement becomes almost exact for a (64 x 64) grid.
This clearly demonstrates that the layer-stripping concept is numerically viable.

APPENDIX  B 

A.E.  Yagle,  “On  the  Feasibility  of  Impulse  Reflection  Response  Data  for 
the  Two-Dimensional  Inverse  Scattering  Problem,”  submitted  to  IEEE  Trans. 
Antennas  and  Propagation 

The  contents  of  this  paper  are  described  in  some  detail  in  Section  IIC. 


APPENDIX  C 


A.E. Yagle, “Multiresolution Algorithms for Solving One-Dimensional
Inverse Scattering Problems Using the Wavelet Transform,”

This  paper  applies  the  wavelet  transform  to  the  1-D  inverse  scattering  problem.  We 
derive  a  layer  stripping  algorithm  that  uses  as  input  the  wavelet  transform  of  the  impulse 
reflection  response  of  the  medium.  The  wavelet  transform  decouples  the  layer  stripping 
algorithm into a set of wave systems at differing wave speeds. Any of these
multiple-resolution systems could be used to reconstruct the medium; this allows some flexibility
on  how  the  data  are  used. 

We  also  derive  both  a  layer  stripping  algorithm,  and  a  linear  system  of  equations,  in 
the  2-D  (time  and  space)  wavelet  transform  domain,  from  the  layer  stripping  algorithm  and 
Krein  integral  equation,  respectively.  These  results  show  how  data  at  one  resolution  affects 
the  reconstruction  at  another  resolution.  They  are  interpreted  in  terms  of  fast  algorithms 
for  the  slanted  Toeplitz  structured  linear  system  of  equations.  The  Born  approximation  is 
also  derived  and  discussed. 

This  paper  formed  the  basis  of  a  successful  AASERT  proposal  to  ONR. 

APPENDIX  D 

A.E.  Yagle,  “Inversion  of  Spherical  Means  Using  Geometric  Inversion  and 
Radon  Transforms,”  Inverse  Problems  8(6),  949-964,  December  1992. 

This  paper  analyzes  the  problem  of  reconstructing  a  function  from  its  spherical  means 
passing  through  the  origin.  A  new  application  of  this  problem  to  diffraction  tomography 
is  noted:  We  show  that  given  probing  by  impulsive  plane  waves  at  all  angles  of  incidence, 
only  a  single  receiving  sensor  is  necessary,  not  an  array  of  sensors. 

Two  versions  of  the  problem  are  defined.  A  layer-stripping-type  algorithm  is  derived 
for  one  version  (we  use  the  term  invariant  imbedding  in  the  paper,  since  it  is  more  familiar 
to  readers,  but  it  is  really  layer  stripping).  The  two  versions  are  shown  to  be  equivalent  to 
the  usual  and  exterior  inverse  Radon  transforms,  respectively,  using  geometric  inversion 
(reflection  about  a  circle).  A  simple  numerical  example  is  also  included. 
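
The geometric fact underlying this equivalence can be checked numerically. The sketch below, written for the plane and using hypothetical function names, verifies only that geometric inversion about the unit circle maps a circle passing through the origin onto a straight line; it does not reproduce the weighting factors or the algorithms of the paper. For a circle with center c passing through the origin, every point x on it satisfies |x|^2 = 2 c.x, so its image y = x/|x|^2 satisfies 2 c.y = 1.

```python
import numpy as np


def invert_about_unit_circle(points):
    """Geometric inversion x -> x / |x|^2 applied to each row of `points`."""
    points = np.asarray(points, dtype=float)
    return points / np.sum(points ** 2, axis=1, keepdims=True)


def circle_through_origin(center, n_points=40, gap=0.3):
    """Sample a circle that passes through the origin, skipping the origin itself."""
    center = np.asarray(center, dtype=float)
    radius = np.linalg.norm(center)
    # Angle (measured from the center) at which the circle meets the origin.
    origin_angle = np.arctan2(-center[1], -center[0])
    angles = origin_angle + np.linspace(gap, 2.0 * np.pi - gap, n_points)
    return center + radius * np.column_stack((np.cos(angles), np.sin(angles)))


center = np.array([0.5, 0.25])
images = invert_about_unit_circle(circle_through_origin(center))
line_values = 2.0 * images @ center        # equals 1 for every image point
```

Integrals over circles (or spheres) through the origin thus correspond, after inversion, to integrals over lines (or planes), which is the setting of the Radon transform.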


APPENDIX  E1 


J.  Frolik  and  A.E.  Yagle,  “Forward  and  Inverse  Scattering  for  Discrete 
Layered  Lossy  and  Absorbing  Media,”  submitted  to  IEEE  Trans.  Circuits  and 
Systems 

A  complete  theory  for  the  1-D  forward  and  inverse  scattering  problems  for  discrete 
layered  lossy  media  is  presented.  By  a  “complete  theory,”  we  mean  systems  of  equations 
that  are  discrete  counterparts  to  integral  equations,  and  discrete  fast  algorithms  that 
solve  these  systems  of  equations.  Applications  to  discrete  lossy  transmission  lines,  and 
to  electromagnetic  wave  propagation  in  absorbing  dielectrics,  are  made,  and  numerical 
examples  presented. 


APPENDIX  E2 

J. Frolik and A.E. Yagle, “Reconstruction of Multi-Layered Lossy
Dielectrics from Plane Wave Impulse Responses at Two Angles of Incidence,”
IEEE Trans. Geosci. and Rem. Sensing 33(2), 268-279, March 1995.

The problem posed in Appendix E1 is solved using plane wave reflection response
at two angles of incidence, rather than reflection and transmission data (the latter would
not be available in remote sensing applications). This includes a novel semi-iterative use
of layer stripping. Numerical examples on reconstructing a glacial ice shelf from radar
reflections demonstrate the significance of modelling multiple reflections and losses.

APPENDIX  F1 

T.-S.  Pan  and  A.E.  Yagle,  “Acceleration  and  Filtering  in  the  Generalized 
Landweber  Iteration  using  a  Variable  Shaping  Matrix,”  IEEE  Trans.  Medical 
Imaging  12(2),  278-286,  June  1993. 

This paper discusses the generalized Landweber iteration for solving large linear
systems of equations. We show how the convergence and filtering behavior of this algorithm
can be tightly controlled, in contrast to most iterative algorithms, whose behavior cannot
be controlled and which simply go where they may. This paper is a good introduction to
the algorithm and how to design it to obtain desired behavior.


Although the specific application investigated here is positron emission tomography,
the results could also be applied to discretized integral equations, such as the
generalized Gel’fand-Levitan or Marchenko integral equations. Although the matrix kernels
are no longer sparse, a projection or backprojection (multiplication by the kernel or its
transpose) can be implemented quickly using FFT-based convolution methods;
number-theoretic transforms would require even fewer multiplications.
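
For readers unfamiliar with the iteration, a minimal sketch follows. It implements the basic Landweber update x <- x + D A^T (b - A x), where D is a fixed shaping matrix; by default D is the identity scaled by 1/sigma_max^2, which keeps the step inside the classical convergence range. This is an illustrative assumption only: it does not reproduce the variable shaping matrix or the acceleration procedures of the appendices, and the function name landweber and its parameters are hypothetical.

```python
import numpy as np


def landweber(A, b, n_iter=500, shaping=None):
    """Basic Landweber iteration for A x = b with an optional shaping matrix."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    n_unknowns = A.shape[1]
    if shaping is None:
        # Step 1/sigma_max^2 lies inside the convergence range (0, 2/sigma_max^2).
        sigma_max = np.linalg.norm(A, 2)
        shaping = np.eye(n_unknowns) / sigma_max ** 2
    x = np.zeros(n_unknowns)
    for _ in range(n_iter):
        residual = b - A @ x               # data mismatch ("projection" step)
        x = x + shaping @ (A.T @ residual) # shaped "backprojection" update
    return x


A = np.array([[2.0, 1.0], [1.0, 3.0], [0.0, 1.0]])
x_true = np.array([1.0, -2.0])
x_est = landweber(A, A @ x_true)
```

Each iteration needs only one multiplication by A and one by its transpose, which is why fast projection and backprojection operators, as discussed above, translate directly into a fast iteration.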

APPENDIX  F2 

T.-S.  Pan  and  A.E.  Yagle,  “Acceleration  and  Filtering  in  the  Generalized 
Landweber  Iteration  using  a  Variable  Shaping  Matrix,”  IEEE  1991  Medical 
Imaging  Conference,  Santa  Fe,  NM,  Nov.  5-9,  1991,  pp.  2028-2032. 

This is the conference paper version of Appendix F1.

APPENDIX  G1 

T.-S. Pan and A.E. Yagle, “Acceleration of Landweber-Type Algorithms
by  Suppression  of  Projection  on  the  Maximum  Singular  Vector,”  IEEE  Trans. 
Medical  Imaging  11(4),  479-487,  December  1992. 

This paper presents a simple procedure to significantly accelerate the convergence of
the generalized Landweber iteration.

APPENDIX  G2 

T.-S. Pan and A.E. Yagle, “Acceleration of Landweber-Type Algorithms
by Suppression of Projection on the Maximum Singular Vector,” IEEE 1991
Medical Imaging Conference, Santa Fe, NM, Nov. 5-9, 1991, pp. 2023-2027.

This is the conference paper version of Appendix G1.

APPENDIX  H1 

T.-S.  Pan  and  A.E.  Yagle,  “Numerical  Study  of  Multigrid  Implementations 
of  Some  Iterative  Image  Reconstruction  Algorithms,”  IEEE  Trans.  Medical 
Imaging  10(4),  572-588,  December  1991. 

This paper investigates the use of several iterative algorithms, including the
generalized Landweber iteration, in multigrid image reconstruction. The image is first
reconstructed quickly on a coarse grid. This coarse image is then used as the initialization for
reconstruction of the image on a fine grid. Many numerical examples are used to illustrate
the performance of various algorithms.

APPENDIX  H2 

T.-S. Pan and A.E. Yagle, “Numerical Study of Multigrid Implementations
of Some Iterative Image Reconstruction Algorithms,” IEEE 1991 Medical
Imaging Conference, Santa Fe, NM, Nov. 5-9, 1991, pp. 2033-2037.

This is the conference paper version of Appendix H1.

APPENDIX  I 

P. Raadhakrishnan, A.E. Yagle, B.V. Rao, and J.E. Dorband, “On Upper
Bounds of the Equivalent Oscillator and Notch-Filter Circuits: A
Non-Commutative Group Theoretic Approach,” IEEE Trans. Circuits and Systems
I 39(9), 756-759, Sept. 1992.

This  paper  is  a  minor  work  that  uses  group  theory  to  aid  in  designing  oscillator 
circuits.  The  application  is  quite  unusual  and  novel. 

APPENDIX  J1 

B.  Sahiner  and  A.E.  Yagle,  “Image  Reconstruction  from  Projections  Under 
Wavelet  Constraints,”  IEEE  Trans.  Sig.  Proc.  41(12),  3579-3584  December 
1993  (special  issue  on  wavelets). 

This paper considers the problem of image reconstruction from projections, given
constraints not on the image, but on certain wavelet coefficients of the image. The idea
is that low-resolution regions of the image can be locally low-pass filtered by setting
high-resolution wavelet coefficients to zero. These are then used as constraints on the image
reconstruction process, so that other areas of the reconstructed image are improved as
well. The constraints are implemented as a simple filter directly on the image.


APPENDIX  J2 


B.  Sahiner  and  A.E.  Yagle,  “On  the  Use  of  Wavelets  in  Inverting  the  Radon 
Transform,”  IEEE  1992  Medical  Imaging  Conference,  Orlando,  FL,  Oct.  25-31, 
1992,  pp.  1129-31. 

This is the conference paper version of Appendix J1.

APPENDIX  K1 

B.  Sahiner  and  A.E.  Yagle,  “Time-Frequency  Distribution  Inversion  of  the 
Radon  Transform,”  IEEE  Trans.  Image  Proc.  2(4),  539-543,  October  1993. 

This paper performs a time-frequency analysis of the projection data in the inverse
Radon transform problem. Regions in time-frequency space in which the distribution
strength is below a threshold are assumed to be due to noise, and are set to zero. This
has the effect of filtering noise out of time-frequency regions in which the signal strength
is small, and leaving the noise in where the signal strength is large. The resulting
time-frequency distribution is then projected to find the nearest feasible signal solution, which
is then backprojected. This reduces noise in the reconstructed image while maintaining
sharpness of image features.


APPENDIX  K2 

B.  Sahiner  and  A.E.  Yagle,  “Time-Frequency  Distribution  Inversion  of  the 
Radon  Transform,”  IEEE  1991  Medical  Imaging  Conference,  Santa  Fe,  NM, 
Nov.  5-9,  1991,  pp.  2043-2047. 

This is the conference paper version of Appendix K1.

APPENDIX L1

B.  Sahiner  and  A.E.  Yagle,  “A  Fast  Algorithm  for  Backprojection  with 
Linear  Interpolation,”  IEEE  Trans.  Image  Proc.  2(4),  547-550,  October  1993. 

This paper derives a simple fast algorithm for backprojection in the inverse Radon
transform. Interpolating and backprojecting four views at once saves half the
multiplications.

APPENDIX L2

B. Sahiner and A.E. Yagle, “A Fast Algorithm for Backprojection,” IEEE
1992 Medical Imaging Conference, Orlando, FL, Oct. 25-31, 1992, pp. 1169-71.

This is the conference paper version of Appendix L1.

APPENDIX  M 

B.  Sahiner  and  A.E.  Yagle,  “Limited  Angle  Tomography  Using  the  Wavelet 
Transform,”  revision  submitted  to  IEEE  Trans.  Image  Proc. 

This  paper  shows  that  when  the  extent  of  missing  angles  is  small  in  limited-angle 
tomography,  two  of  the  three  sets  of  detail  images  in  the  wavelet  transform  are  unaffected, 
and  low-resolution  images  can  be  obtained  by  interpolation.  Using  some  a  priori  partial 
information  on  edges  parallel  to  the  missing  angles,  we  have  developed  a  wavelet-domain 
algorithm  for  restoring  the  image. 


APPENDIX  N 

B. Sahiner and A.E. Yagle, “Region-of-Interest Tomography using
Exponential Radial Sampling,” to appear in IEEE Trans. Image Proc. 4(9), August
1995.

This paper shows that the local tomography problem of reconstructing only a small
region of interest (ROI) from a limited set of projections can be solved by sampling the
projections at a rate that decreases exponentially with distance from the ROI. This
reconstructs the ROI with high resolution, and the remainder of the image at lower resolution.
The algorithm is also much faster than conventional filtered backprojection.

APPENDIX O

B.  Sahiner  and  A.E.  Yagle,  “Reconstruction  from  Projections  under  Time- 
Frequency  Constraints,”  to  appear  in  IEEE  Trans.  Med.  Imag.  14(2),  June 
1995. 

This paper derives a fast image-domain filter which solves the following constrained
inverse Radon transform problem: Given constraints on certain wavelet coefficients of the
image, compute from its projections the image which either: (a) requires the smallest
perturbation of the projection data to satisfy these constraints; or (b) is the constrained
linear least-squares image estimate. The wavelet transform can be used for
spatially-varying filtering of an image, suppressing noise locally in smooth regions; we also discuss
detection of such regions in a noisy image, which leads to the wavelet coefficient constraints.
Numerical results show improvement over filtered images, since the constraints improve
the reconstruction in non-constrained areas as well.

APPENDIX  P 

H. Soltanian-Zadeh, J.P. Windham, D.J. Peck, and A.E. Yagle, “A
Comparative Analysis of Several Transformations for Enhancement and
Segmentation of Magnetic Resonance Image Scene Sequences,” IEEE Trans. Medical
Imaging 11(3), 302-318, September 1992.

Its  title  describes  this  paper  very  well. 

APPENDIX  Q1 

H. Soltanian-Zadeh, J.P. Windham and A.E. Yagle, “Optimal
Transformation for Correcting Partial Volume Averaging Effects in Magnetic Resonance
Imaging,” IEEE Trans. Nuclear Science 40(4), 1204-1212, August 1993.

Again,  its  title  describes  this  paper  very  well. 

APPENDIX  Q2 

H. Soltanian-Zadeh, J.P. Windham and A.E. Yagle, “Optimal
Transformation for Correcting Partial Volume Averaging Effects in Magnetic Resonance
Imaging,” IEEE 1992 Medical Imaging Conference, Orlando, FL, Oct. 25-31,
1992, pp. 1289-91.

This is the conference paper version of Appendix Q1.

APPENDIX  R1 

H. Soltanian-Zadeh, R. Saigal, J.P. Windham, A.E. Yagle, and D.O. Hearshen,
“Optimization of MRI Protocols and Pulse Sequence Parameters for
Eigenimage Filtering,” IEEE Trans. Medical Imaging 13(1), 161-175, March 1994.

This paper proposes a procedure for optimizing the acquisition of MRI scene
sequences, if eigenimage filtering (see Appendix P) is then used to process the MRI scene
sequence.


APPENDIX  R2 

H. Soltanian-Zadeh, A.E. Yagle, J.P. Windham, and D.O. Hearshen,
“Optimization of MRI Protocols and Pulse Sequence Parameters for Eigenimage
Filtering,” IEEE 1992 Medical Imaging Conference, Orlando, FL, Oct. 25-31,
1992, pp. 1325-27.

This is the conference paper version of Appendix R1.

APPENDIX  S 

H. Soltanian-Zadeh, J.P. Windham and A.E. Yagle, “A Multidimensional
Non-Linear Edge-Preserving Filter for Magnetic Resonance Image
Restoration,” IEEE Trans. Image Proc. 4(2), February 1995.

Although the edge-preserving filter with locally-varying properties was designed
specifically for MRI, it may have applications elsewhere.

APPENDIX  T 

P. Raadhakrishnan, J. Dorband, and A.E. Yagle, “An Algorithm for
Forward and Inverse Scattering in the Time Domain,” to appear in J. Acoust.
Soc. Am., April 1995.

As noted above, the invariant imbedding algorithm used to generate the forward
problem data is computationally intensive. This is because invariant imbedding, although
similar to layer stripping in approach, is quite different, in that it does not take advantage
of time causality. Hence it solves many more problems than are actually needed. Layer
stripping avoids this and is more efficient.

This paper is a first attempt at parallelizing invariant-imbedding-based algorithms for
both the forward and the inverse problems, and thus reducing the computation required.
Only the 1-D problem is considered here. The new algorithm is parallelizable and gives
results identical to the invariant imbedding algorithm of Corones et al. A simple error
analysis of the effects of computational noise on the algorithm is also supplied.


APPENDIX  A 

A.E. Yagle and P. Raadhakrishnan, “Numerical Performance of Layer
Stripping Algorithms for Two-Dimensional Inverse Scattering Problems,”
Inverse Problems 8(4), 645-665, August 1992.

This paper summarizes our results to date on the numerical performance of the 2-D
layer stripping algorithm with respect to the factors listed in subsection IC-2. It also
includes a derivation of the 2-D invariant imbedding algorithm used to produce the forward
problem scattering data; this algorithm is based on a previous algorithm (see references).
Since the forward problem and inverse problem algorithms operate differently, this
constitutes a much better test of the inversion algorithm. A problem with this algorithm is
the enormous amount of computation required to generate the forward problem data; our
inverse problem algorithm is much faster, even though the inverse problem is
mathematically more difficult. The Born approximation to the layer stripping algorithm is also
derived and discussed.

The reconstructed potentials agree closely with the original potentials, even for coarse
(16 x 16) discretization grids; the agreement becomes almost exact for a (64 x 64) grid.
This clearly demonstrates that the layer-stripping concept is numerically viable.


Numerical performance of layer stripping algorithms for
two-dimensional inverse scattering problems

Revised April 1992


Abstract 

Numerical results of implementing a two-dimensional layer stripping algorithm to solve
the two-dimensional Schrodinger equation inverse potential problem are presented and
discussed. This is the first exact (all multiple scattering and diffraction effects are included)
numerical solution of a multi-dimensional Schrodinger equation inverse potential problem,
excluding optimization-based approaches. The results are as follows: (1) the layer stripping
algorithm successfully reconstructed the potential from scattering data measured on a plane
(as it would be in many applications); (2) the algorithm avoids multiple scattering errors
present in Born approximation reconstructions; and (3) the algorithm is insensitive to small
amounts of noise in the scattering data. Simplifications of layer stripping and invariant
imbedding algorithms under the Born approximation are also discussed.

1. Introduction


The inverse scattering problem for the Schrodinger equation in two dimensions with a
time-independent, local, non-circularly symmetric potential has many applications. Two
of these applications are as follows: (1) reconstruction of a three-dimensional (3-D)
acoustic medium with density and wave speed varying in two dimensions (2-D), from surface
measurements of the steady-state medium displacement response to a harmonic line source
[1]; and (2) reconstruction of a 3-D electrical medium with resistivity varying in 2-D from
surface measurements of the potential resulting from a line DC current source [2]. Both of
these applications are quickly reviewed below in Section 2.1.

Two major approaches for exact solution of the 2-D Schrodinger equation inverse
potential problem have been proposed. The first is the 2-D version of the Gel’fand-Levitan
and Marchenko integral equation methods [3]. The other is the 2-D version of the layer
stripping differential methods [4]. Here “exact” means that all diffraction and multiple
scattering effects are included in the mathematical solution; errors in the solution will
arise solely due to purely numerical effects such as discretization and roundoff. Hence all
methods based on the Born (single-scattering) approximation are excluded here, since such
methods, and their modifications, do not take into account all multiple scattering effects.
In Section 2.4 we discuss how the Born approximation applies to the algorithm of [4]. No
numerical implementation of the methods of either [3] or [4] has previously been reported.

This paper presents the results of the first numerical implementation of the 2-D version
of the layer-stripping algorithm of [4]. It is thus the first exact (as defined above) numerical
solution of a multi-dimensional Schrodinger equation inverse potential problem. Note
that optimization-based approaches minimize (or maximize) some criterion; thus they are
not in the spirit of the approach considered here. Although only reconstruction of the
Schrodinger scattering potential is considered here, direct application to specific inverse
scattering problems, as in [1] and [2], would be possible.

This paper is organized as follows. In Section 2 the 2-D Schrodinger equation inverse
potential problem is formulated, two applications are noted, the layer stripping algorithm
of [4] is reviewed, details of its numerical implementation are discussed, and its
simplification under the Born approximation is discussed. In Section 3 the invariant imbedding
algorithm of [5] used to generate the scattering data is reviewed, and details of its
numerical implementation are discussed. We also discuss its simplification under the Born
approximation, and show analytically that the Born-simplified layer stripping algorithm
successfully inverts the Born-simplified invariant imbedding algorithm scattering data.
Although the latter result is new, it is intended primarily to give some feel for the algorithms
of [4] and [5].

Section 4 summarizes the numerical results, and presents some illustrative examples.
Issues illustrated include: (1) errors in reconstructed potentials using the Born
approximation, which are eliminated using the “exact” layer stripping algorithm; (2) effects on
reconstructed potentials of various amounts of noise in the data; (3) effects on reconstructed
potentials of regularization of transverse derivatives in the layer stripping algorithm; and
(4) effects of choosing various discretization lengths in the layer stripping algorithm.
Section 5 concludes with a summary.

2.  Two-Dimensional  Layer  Stripping  Algorithm 

2.1  Problem  Formulation  and  Applications 

The 2-D inverse scattering problem considered in this paper is as follows. The problem
is defined in 2-D $(x, z)$ space, where $x$ is lateral position and $z$ is depth, increasing downward
from the surface $z = 0$. The wave field $\hat{p}(x, z, k)$ satisfies the 2-D Schrodinger equation

$$\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial z^2} + k^2 - V(x,z)\right)\hat{p}(x,z,k) = 0, \tag{2.1}$$

where the potential $V(x,z)$ is real-valued, smooth, and has support in $z$ in the interval
$0 < z < L$. It is also assumed that $V(x,z)$ does not induce bound states; a sufficient
condition for this is for $V(x,z)$ to be non-negative.

The medium is probed by an impulsive plane wave $\delta(t - z)$ which passes through the
surface $z = 0$ at time $t = 0$ and induces scattering by $V(x,z)$ for $t > 0$. The scattering
data consists of measurements of the wave field $\hat{p}(x,z^*,k)$ and its gradient for
some $z^*$ in the homogeneous half-space $z \le 0$. For convenience, we assume measurements
are taken at the surface $z^* = 0$, as they would be in the applications to follow. The inverse
scattering experiment is illustrated in Fig. 1.

We now quickly review two applications of this problem. First, consider the problem
of reconstructing a 3-D inhomogeneous acoustic medium whose density $\rho(x, z)$ and wave
speed $c(x, z)$ are smooth functions of depth $z$ and lateral position $x$. The medium is
bounded by a free (pressure-release) surface $z = 0$. The density $\rho_0$ and wave speed $c_0$ for
$z < 0$ and $z \to \infty$ are known. The medium is probed with cylindrical harmonic waves,
at two frequencies $\omega_1$ and $\omega_2$, from a harmonic line source extending along the x-axis,
and the sinusoidal steady-state vertical acceleration $a(x,y,z = 0;\omega_i)$ of the medium at
the free surface $z = 0$ is measured. The goal is to reconstruct $\rho(x, z)$ and $c(x, z)$ from the
measurements $a(x,y,z = 0;\omega_i)$, $i = 1,2$.

This problem can be formulated as a 2-D Schrodinger equation inverse potential
problem by Fourier transforming the basic acoustic equations with respect to time and the other
lateral variable $y$. Details are given in both [1] and [4]. Here we merely note that in the
Schrodinger equation (2.1) the wave field $\hat{p}(x,z,k)$ is pressure divided by $\rho(x,z)^{1/2}$, the
wavenumber $k = \sqrt{\omega_i^2/c_0^2 - k_y^2}$, and the potential $V(x,z;\omega_i)$ is

$$V(x,z;\omega_i) = \frac{\omega_i^2}{c_0^2}\left(1 - \frac{c_0^2}{c(x,z)^2}\right) + \rho(x,z)^{1/2}\,\nabla^2\!\left(\rho(x,z)^{-1/2}\right). \tag{2.2}$$


It is clear that performing this experiment for two different frequencies $\omega_i$, $i = 1,2$ will
allow $\rho(x,z)$ and $c(x,z)$ to be computed from (2.2). The wave field is zero at the free
surface $z = 0$; its gradient is the medium acceleration $\rho(x, 0)^{1/2}a(x,y, z = 0;\omega_i)$, $i = 1,2$.

The second application is the inverse resistivity problem of reconstructing a 3-D
inhomogeneous electrical medium whose resistivity $\rho(x,z)$ is a smooth function of $x$ and $z$
over a bounded region. The medium is probed with current from a line DC current source
extending along the x-axis, and the electrical potential $v(x, y, z = 0)$ induced on the
surface $z = 0$, assumed to be a perfect insulator, is measured. The goal is to reconstruct the
resistivity $\rho(x,z)$ from the measurements of electrical potential $v(x,y, z = 0)$. Note that
for both applications, the response to a line source may be found by superposition of the
responses due to point sources along the x-axis.

This problem can be formulated as a 2-D Schrodinger equation inverse potential
problem by Fourier transforming Ohm’s and Kirchhoff’s current laws with respect to the
other lateral variable $y$. Details are given in [5]. Here we merely note that in the
Schrodinger equation (2.1) the wave field $\hat{p}(x,z,k)$ is now the inverse Laplace transform
of the Fourier transform of electrical potential divided by $\rho(x,z)^{1/2}$, and the scattering
potential is $V(x,z) = \rho(x,z)^{1/2}\,\nabla^2\!\left(\rho(x,z)^{-1/2}\right)$.

2.2  The  2-D  Layer  Stripping  Algorithm 

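The layer stripping algorithm for solving the 2-D Schrodinger equation inverse
potential problem is derived as follows [4]. Taking the inverse Fourier transform of (2.1) with
respect to $k$ yields

$$\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial z^2} - \frac{\partial^2}{\partial t^2} - V(x,z)\right)p(x,z,t) = 0, \tag{2.3}$$

where

$$p(x,z,t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \hat{p}(x,z,k)\,e^{-ikt}\,dk. \tag{2.4}$$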

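Eq. (2.3) can be written as the coupled system

$$\left(\frac{\partial}{\partial z} + \frac{\partial}{\partial t}\right)p(x,z,t) = q(x,z,t) \tag{2.5a}$$

$$\left(\frac{\partial}{\partial z} - \frac{\partial}{\partial t}\right)q(x,z,t) = \left(V(x,z) - \frac{\partial^2}{\partial x^2}\right)p(x,z,t). \tag{2.5b}$$

From causality and the form of (2.5a), $p(x,z,t)$ and $q(x,z,t)$ have the forms

$$p(x, z, t) = \delta(t - z) + \tilde{p}(x, z, t)\,1(t - z) \tag{2.6a}$$

$$q(x, z, t) = \tilde{q}(x, z, t)\,1(t - z), \tag{2.6b}$$

where $\tilde{p}$ and $\tilde{q}$ are the smooth parts of $p$ and $q$, respectively, and $1(\cdot)$ is the unit step or
Heaviside function.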

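Inserting (2.6) in (2.5) and equating coefficients of $\delta(t - z)$ (propagation of singularities
argument) yields

$$\left(\frac{\partial}{\partial z} + \frac{\partial}{\partial t}\right)\tilde{p}(x,z,t) = \tilde{q}(x,z,t) \tag{2.7a}$$

$$\left(\frac{\partial}{\partial z} - \frac{\partial}{\partial t}\right)\tilde{q}(x,z,t) = \left(V(x,z) - \frac{\partial^2}{\partial x^2}\right)\tilde{p}(x,z,t) \tag{2.7b}$$

$$V(x,z) = -2\tilde{q}(x,z,t = z^+). \tag{2.7c}$$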
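
The propagation-of-singularities step can be checked explicitly. The short derivation below is a sketch, assuming that (2.5b) reads $(\partial_z - \partial_t)q = (V - \partial_x^2)p$ and writing $\tilde{p}$, $\tilde{q}$ for the smooth parts in (2.6); it uses only $\partial_z 1(t-z) = -\delta(t-z)$ and $\partial_t 1(t-z) = \delta(t-z)$.

```latex
\begin{align*}
\Big(\tfrac{\partial}{\partial z}-\tfrac{\partial}{\partial t}\Big)
  \big[\tilde q\,1(t-z)\big]
  &= 1(t-z)\Big(\tfrac{\partial}{\partial z}-\tfrac{\partial}{\partial t}\Big)\tilde q
     \;-\;2\,\tilde q\,\delta(t-z),\\[2pt]
\Big(V-\tfrac{\partial^2}{\partial x^2}\Big)
  \big[\delta(t-z)+\tilde p\,1(t-z)\big]
  &= V\,\delta(t-z)
     \;+\;1(t-z)\Big(V-\tfrac{\partial^2}{\partial x^2}\Big)\tilde p .
\end{align*}
% Coefficients of \delta(t-z):  -2\,\tilde q(x,z,t=z^+) = V(x,z)          (2.7c)
% Coefficients of 1(t-z):       (\partial_z-\partial_t)\tilde q
%                                 = (V-\partial_x^2)\tilde p              (2.7b)
```

Equation (2.7a) follows in the same way from (2.5a), since $(\partial_z + \partial_t)$ annihilates both $\delta(t-z)$ and $1(t-z)$.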

Equations (2.7) constitute the basic 2-D layer stripping equations: Starting with measured
$\tilde{p}(x, 0, t)$ and $\tilde{q}(x, 0, t)$ (the gradient of the wave field is required for the latter), propagate
(2.7) recursively in increasing depth $z$, reconstructing $V(x,z)$ as the algorithm proceeds.
The coupled equations (2.7a) and (2.7b) include all multiple scattering and diffraction
effects, since they are equivalent to the Schrodinger equation (2.1) in the time domain.
The potential may be reconstructed using (2.7c) since (2.7) is implemented at the wave
front $t = z$; by time causality there has been no time for multiple scattering to occur yet.
Some advantages of using layer stripping algorithms are as follows:

1. Only backscattered data from one direction of probing is required. Integral equation
methods [3] require the scattering amplitude, which is the far-field response in all
directions to an incident impulsive plane wave in each possible direction. In the
applications noted above, this is unrealistic; it also runs the risk of inconsistent data;

2. The amount of computation required is much less than the amount required to solve
the integral equations of [3]. The layer stripping algorithm can be viewed as a fast
algorithm solution of these integral equations which exploits the Hankel structure in
the kernel of the generalized Marchenko integral equation of [3];

3. All multiple scattering and diffraction effects are included, unlike methods such as
distorted-wave Born approximation which only account for some of these effects.

Two  disadvantages  of  layer  stripping  algorithms  are  as  follows: 

1. It is not clear how to incorporate the effects of bound states (roughly, square-integrable
solutions to the Schrodinger equation with negative energy), unlike the approach of
[3];

2. The lateral derivative $\partial^2/\partial x^2$ in (2.7b) can be expected to induce numerical instability.

2.3 Numerical Implementation of the 2-D Layer Stripping Algorithm

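The second disadvantage can be removed as follows. Take the Fourier transform of
(2.7) with respect to $x$. The result is

$$\left(\frac{\partial}{\partial z} + \frac{\partial}{\partial t}\right)\tilde{p}(z,t,k_x) = \tilde{q}(z,t,k_x) \tag{2.8a}$$

$$\left(\frac{\partial}{\partial z} - \frac{\partial}{\partial t}\right)\tilde{q}(z,t,k_x) = V(z,k_x) * \tilde{p}(z,t,k_x) + k_x^2\,\tilde{p}(z,t,k_x) \tag{2.8b}$$

$$V(z,k_x) = -2\tilde{q}(z,t = z^+,k_x) \tag{2.8c}$$

where $*$ denotes convolution in $k_x$,

$$\tilde{p}(z,t,k_x) = \int_{-\infty}^{\infty} \tilde{p}(x,z,t)\,e^{-ik_x x}\,dx \tag{2.9}$$

and $\tilde{q}(z,t,k_x)$ and $V(z,k_x)$ are defined similarly.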

The multiplication by $k_x^2$ in (2.8b) will induce numerical instability. This may be
avoided by replacing the multiplication by $k_x^2$ in (2.8b) with multiplication by the clipped
filter

$$H(k_x) = \begin{cases} k_x^2, & |k_x| \le K \\ 0, & |k_x| > K \end{cases} \tag{2.10}$$

for some cutoff wavenumber $K$. This is reminiscent of the clipped filter used in the filtered
back-projection procedure for inverting the Radon transform. In practice, the
discontinuities in (2.10) at $|k_x| = K$ would be replaced by a smooth window to zero; a Hanning
(raised cosine) window was used in the numerical simulations presented later.
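
A minimal sketch of such a filter is given below. It assumes the clipped filter equals $k_x^2$ inside the cutoff and zero outside, and it rolls off with a raised-cosine taper over a band of width `taper_width` just below the cutoff. The function name, the `taper_width` parameter, and the exact placement of the taper are illustrative assumptions; the paper does not specify how its Hanning window was applied.

```python
import numpy as np

def clipped_filter(kx, K, taper_width=0.0):
    """Return kx**2 for |kx| <= K and 0 beyond the cutoff K.

    If taper_width > 0, the response is multiplied by a raised-cosine
    roll-off that falls from 1 at |kx| = K - taper_width to 0 at |kx| = K.
    """
    kx = np.asarray(kx, dtype=float)
    a = np.abs(kx)
    H = np.where(a <= K, kx**2, 0.0)          # hard clip at the cutoff
    if taper_width > 0.0:
        start = K - taper_width
        in_taper = (a > start) & (a <= K)
        w = 0.5 * (1.0 + np.cos(np.pi * (a - start) / taper_width))
        H = np.where(in_taper, kx**2 * w, H)  # smooth the jump at |kx| = K
    return H
```

With `taper_width=0` the filter has a jump of height `K**2` at the cutoff; a positive `taper_width` removes that jump at the cost of attenuating the wavenumbers just below `K`.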

We now discretize depth $z = n\Delta$ and time $t = j\Delta$ to integer multiples of some
discretization length $\Delta$. Since the wave speed in (2.1) is unity, depth and time have the
same $\Delta$. Lateral position $x$ would also use the same $\Delta$; but wavenumber $k_x = k\Delta_k$ must
use for $\Delta_k$ half the reciprocal of the total lateral extent of interest; e.g., if the potential has
finite support $-L_x/2 < x < L_x/2$ in $x$, $L_x$ would be the lateral extent of interest. Note
that $\Delta$ and $\Delta_k$ have reciprocal units.

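Using forward difference approximations to the partial derivatives then yields

$$\tilde{p}((n+1)\Delta, (j+1)\Delta, k\Delta_k) = \tilde{p}(n\Delta, j\Delta, k\Delta_k) + \tilde{q}(n\Delta, j\Delta, k\Delta_k)\,\Delta \tag{2.11a}$$

$$\tilde{q}((n+1)\Delta, (j-1)\Delta, k\Delta_k) = \tilde{q}(n\Delta, j\Delta, k\Delta_k) + H(k\Delta_k)\,\tilde{p}(n\Delta, j\Delta, k\Delta_k)\,\Delta + \sum_{m=-\infty}^{\infty} V(n\Delta, (k-m)\Delta_k)\,\tilde{p}(n\Delta, j\Delta, m\Delta_k)\,\Delta\,\Delta_k \tag{2.11b}$$

$$V((n+1)\Delta, k\Delta_k) = -2\tilde{q}((n+1)\Delta, (n+1)\Delta, k\Delta_k). \tag{2.11c}$$

Equations (2.11) constitute the numerical implementation of the 2-D layer stripping
algorithm. The update patterns are illustrated in Fig. 2; note that by time causality $\tilde{p}(z, t, k_x)$
and $\tilde{q}(z,t, k_x)$ are zero for $t < z$.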
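
The recursion can be sketched in a few lines. The code below is an illustration under stated assumptions, not the implementation used for the reported results: the surface data are stored as arrays indexed by (time sample, wavenumber sample), the infinite sum over wavenumber is replaced by a cyclic convolution computed with the DFT, the potential at the current depth is used in the coupling term, and no $2\pi$ normalization is applied. The names `layer_strip` and `cyclic_conv` are hypothetical.

```python
import numpy as np

def cyclic_conv(a, b):
    """Cyclic convolution over the wavenumber index, computed with the DFT."""
    return np.fft.ifft(np.fft.fft(a) * np.fft.fft(b))

def layer_strip(p0, q0, H, delta, delta_k, n_depths):
    """Propagate the discretized layer stripping recursion downward.

    p0, q0 : arrays of shape (n_times, n_k) with the smooth surface data
             at times t = j * delta and wavenumbers k * delta_k.
    H      : array of shape (n_k,) with the clipped filter samples.
    Returns V of shape (n_depths, n_k).
    """
    p = np.array(p0, dtype=complex)
    q = np.array(q0, dtype=complex)
    n_times, n_k = p.shape
    # Depth n needs data out to the two-way travel time t = 2 * n * delta.
    assert 2 * (n_depths - 1) < n_times, "not enough time samples"
    V = np.zeros((n_depths, n_k), dtype=complex)
    V[0] = -2.0 * q[0]                        # potential at the surface
    for n in range(n_depths - 1):
        last = n_times - 1 - n                # last valid time index at depth n
        p_new = np.zeros_like(p)
        q_new = np.zeros_like(q)
        for j in range(n, last):              # step along t - z = const
            p_new[j + 1] = p[j] + q[j] * delta
        for j in range(n + 2, last + 1):      # step along t + z = const
            coupling = H * p[j] + delta_k * cyclic_conv(V[n], p[j])
            q_new[j - 1] = q[j] + coupling * delta
        p, q = p_new, q_new
        V[n + 1] = -2.0 * q[n + 1]            # potential at the wave front
    return V
```

Each depth step consumes one time sample at each end of the valid window, so reconstructing `n_depths` layers needs at least `2 * n_depths - 1` time samples.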

Cheney [6] has shown that the modification (2.10) stabilizes the layer stripping
algorithm (2.11), in the following sense. Define the norm

$$\|f(k_x,t)\| = \sup_{t} \int_{-\infty}^{\infty} |f(k_x,t)|\,dk_x. \tag{2.12}$$

Input two different sets of bounded initial data $\tilde{p}_i(k_x, 0, t)$, $\tilde{q}_i(k_x,0, t)$, $i = 1, 2$ into the
discretized algorithm (2.11), resulting in two different reconstructed potentials $V_i(k_x,z)$, $i = 1,2$.
Let $\|\tilde{p}_i(k_x,0,t)\| < K'$ and $\|\tilde{q}_i(k_x,0,t)\| < K'$ for some $K'$. Then for $z = n\Delta$ we
have

$$\sup_{k_x} |V_1(k_x,z) - V_2(k_x,z)| \le K_1(z)\,\|\tilde{p}_1(k_x,0,t) - \tilde{p}_2(k_x,0,t)\| + K_2(z)\,\|\tilde{q}_1(k_x,0,t) - \tilde{q}_2(k_x,0,t)\|, \tag{2.13}$$

where $K_1(z)$ and $K_2(z)$ are polynomials in $n$, $\Delta$, $K$, and $K'$.

The discretized system (2.11) can be implemented as is. However, its spectral
properties are worth examining. It might seem as though we can regard the discretized functions
$\tilde{p}(n\Delta,j\Delta,k\Delta_k)$, etc. as merely sampled versions of the continuous functions $\tilde{p}(z,t,k_x)$,
etc., provided the latter are bandlimited and sampling is performed above the Nyquist
rate. However, the nonlinear product in (2.7b) becomes the convolution in $k_x$ in (2.8b)
and (2.11b); the wavenumbers become mixed. Indeed, even if the inverse potential problem
is regularized by assuming that $V(z, k_x)$ is bandlimited in $z$ and zero for $|k_x| > K$ for some
$K$, it is clear that $\tilde{p}(z,t,k_x)$, etc. will NOT have similar properties. Imposing a
bandlimited condition at each recursion will lead to errors, since the missing high wavenumbers
will cause errors at low wavenumbers due to the wavenumber mixing. This leads to the
question of what the discretized $\tilde{p}(n\Delta,j\Delta, k\Delta_k)$, etc. mean, and how the convolutions in
$k_x$ should be performed. It should be noted that similar questions arise in integral equation
methods.
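
The wavenumber mixing is easy to reproduce on a small periodic grid. The sketch below is a toy illustration with made-up signals, not data from the paper: a potential at DFT bin 5 multiplies a field with components at bins 6 and 11, and the band limit is bin 8. Clipping the field before forming the product discards bin 11, which would otherwise have mixed down to bin 6 inside the band. The helper `band_limit` is a hypothetical name.

```python
import numpy as np

def band_limit(f, K):
    """Zero every DFT bin whose integer wavenumber exceeds K in magnitude."""
    F = np.fft.fft(f)
    bins = np.fft.fftfreq(len(f), d=1.0 / len(f))   # integer bin numbers
    F[np.abs(bins) > K] = 0.0
    return np.fft.ifft(F).real

n, K = 64, 8
x = np.arange(n)
V = np.cos(2 * np.pi * 5 * x / n)                        # inside the band
p = np.cos(2 * np.pi * 6 * x / n) + np.cos(2 * np.pi * 11 * x / n)

exact = band_limit(V * p, K)                    # keep all of p, clip at the end
clipped = band_limit(V * band_limit(p, K), K)   # clip p before multiplying
error = np.max(np.abs(exact - clipped))         # lost component at bin 6
```

Here `error` is the amplitude of the in-band component at bin 6 that came from the out-of-band bin 11, which is the low-wavenumber error described above.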


One possible interpretation is to perform a periodic extension in $k$ of all quantities in
(2.11). The period in $k$ should be $1/\Delta$; $K$ in (2.10) should then be half this. It is clear
by induction that if all quantities at depth $n\Delta$ are periodic in $k$, then all quantities at
depth $(n + 1)\Delta$ will also be periodic in $k$. This has two advantages: (1) the infinite linear
convolution becomes a finite cyclic convolution; and (2) the discrete Fourier transform
may be used to perform all Fourier transforms. Since periodicity in one Fourier domain
is equivalent to discreteness in the other Fourier domain, the problem has effectively been
discretized laterally as well as vertically: the quantities propagated in (2.11) are not samples
of a bandlimited function, but actual discrete values. As $\Delta_k \to 0$, the situation approaches
the continuous problem.
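
Both advantages can be seen in a few lines. The sketch below, using NumPy's unnormalized DFT convention, checks that a cyclic convolution computed by direct summation agrees with one computed through the DFT, and that a pointwise product in the lateral domain corresponds to a cyclic convolution (divided by the number of samples) in the wavenumber domain. The function names are illustrative only.

```python
import numpy as np

def cyclic_conv_direct(a, b):
    """Cyclic convolution by direct summation, indices taken modulo len(a)."""
    n = len(a)
    return np.array([sum(a[(k - m) % n] * b[m] for m in range(n))
                     for k in range(n)])

def cyclic_conv_fft(a, b):
    """The same cyclic convolution, computed with the discrete Fourier transform."""
    return np.fft.ifft(np.fft.fft(a) * np.fft.fft(b))

rng = np.random.default_rng(0)
f = rng.standard_normal(16)     # samples of one quantity in lateral position
g = rng.standard_normal(16)     # samples of another

# Product in x  <->  cyclic convolution in k, scaled by 1/n for this convention.
lhs = np.fft.fft(f * g)
rhs = cyclic_conv_fft(np.fft.fft(f), np.fft.fft(g)) / len(f)
```

The direct sum costs on the order of `n**2` operations per depth and time step, while the DFT route costs on the order of `n log n`.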


2.4 Born Approximation to the Layer Stripping Algorithm


It is worth noting how the Born approximation applies to the layer stripping equations
(2.7). The Born approximation is a linearization of the inverse potential problem; the idea
is to render the potential linearly related to the scattering data. This has been
discussed in detail elsewhere; here we merely scale the potential by a small parameter $\epsilon$,
expand $\tilde{p}(x, z,t)$, etc. in a Taylor series in $\epsilon$, and discard all terms of order $\epsilon^2$ or smaller.
The result is elimination of the product in (2.7b); since this is the one nonlinearity in (2.7)
its elimination is not surprising. Combining the modified (2.7a) and (2.7b) and keeping
(2.7c) results in

$$\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial z^2} - \frac{\partial^2}{\partial t^2}\right)\tilde{q}(x,z,t) = 0 \tag{2.14a}$$

$$V(x, z) = -2\tilde{q}(x, z,t = z^+). \tag{2.14b}$$


We recognize (2.14a) as the migration operator relating the wave field at the surface $z = 0$
to the wave field on the plane parallel to the surface at depth $z$, and (2.14b) as the imaging
operator (gradient) applied to the migrated wave field. Taking two Fourier transforms of
(2.14a) with respect to $t$ and $x$ and using (2.4) and (2.9) yields

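$$\left(\frac{\partial^2}{\partial z^2} + (k^2 - k_x^2)\right)\hat{q}(k_x,z,k) = 0; \qquad \hat{q}(k_x,z,k) = \mathcal{F}_{x \to k_x}\mathcal{F}_{t \to k}\{\tilde{q}(x, z, t)\} \tag{2.15}$$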

a differential equation which has the solution

$$\hat{q}(k_x,z,k) = \hat{q}(k_x,0,k)\,e^{-i\sqrt{k^2 - k_x^2}\,z}. \tag{2.16}$$

The operation of the Born approximation to the layer stripping equations is now clear: (1)
migrate the wave field from the surface to depth $z$; and (2) image the wave field at depth
$z$ to obtain the scattering potential. Note that imaging the potential requires taking the
gradient of the wave field; this is why $\tilde{q}$, not $\tilde{p}$, is used. Note also that multiple scattering,
which is inherently nonlinear, is neglected in (2.14) and (2.16). The coupling induced by
the product term in (2.7) accounts precisely for all multiple scattering. More details about
the Born approximation and its relation to layer stripping and integral equation methods
are available in [4] and [7].

From the Schrodinger equation (2.1), it is apparent that for large wavenumbers $k$ the
potential $V(x,z)$ will be relatively small, and that multiple scattering will be less
significant. Indeed, in the limit $k \to \infty$ the Born approximation becomes exact, in that multiple
scattering effects become negligible. However, inversion based solely on asymptotically
large $k$ is clearly unstable; “exact” inverse scattering methods use low-wavenumber data
as well as high-wavenumber data to stabilize the reconstruction. Also, it is clear that
multiple scattering is more significant for small $k$ ($V(x,z)$ is relatively large), so lack of
high-wavenumber data makes the use of “exact” methods even more imperative.

3.  Forward  Problem  Algorithm 

3.1 Invariant Imbedding Algorithm

The invariant imbedding algorithm of [5] was used to generate the scattering data,
to be input into the layer stripping algorithm. We briefly review this algorithm here,
following the notation of [5] for convenience. Let $k$ be wavenumber, as in the Schrodinger
equation (2.1), $q = k_x$ (lateral wavenumber), $k(q) = \sqrt{k^2 - q^2}$ (vertical wavenumber, as
in (2.16)), and $p$ be lateral wavenumber of the incident plane wave (ultimately we are
interested in $p = 0$). Then further define $h(z,q) = V(z,k_x)$ (scattering potential) and
$u(z,q) = \hat{p}(k_x,z,k)$ (wave field; see (2.15)). A slight problem with the notation of [5] is
that the dependence of $u(z,q)$ and $R(c,q,p)$ on $k$ is not explicit.

Finally, define R(c,q,p) as the near-field planar reflection response, in direction q, of
the portion of the medium below depth c, to an impulse $\delta(q-p)e^{-ik(p)z}/k(p)$, in
direction p (recall directions are specified by wavenumbers). Two inverse Fourier transforms
taking k → t and q → x, as in (2.15), transform $\delta(q-p)e^{-ik(p)z}$ into the impulsive
plane wave $\delta(t - z\cos\theta - x\sin\theta)$, where θ is the angle of incidence
(measured from the vertical) defined from $p = k\sin\theta$. Hence k(p)R(0,q,0), computed
for each k and then inverse Fourier transformed as in (2.4), is precisely the reflection
response to an impulsive plane wave normally incident on the medium (for which p = 0 and
k(p) = k).
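
As a small illustration of the wavenumber bookkeeping above, the following Python sketch evaluates the vertical wavenumber $k(q) = \sqrt{k^2 - q^2}$ and the incidence angle θ defined by $p = k\sin\theta$. The function names are hypothetical, and returning the principal complex square root when |q| > k is an assumption; the text does not say how [5] treats that case.

```python
import cmath
import math


def vertical_wavenumber(k: float, q: float) -> complex:
    """Return k(q) = sqrt(k^2 - q^2).

    For |q| > k the argument is negative and the principal complex root is
    returned (an assumption made for this sketch, not taken from the text).
    """
    return cmath.sqrt(k * k - q * q)


def incidence_angle(k: float, p: float) -> float:
    """Return theta (radians, measured from the vertical) with p = k sin(theta)."""
    return math.asin(p / k)


if __name__ == "__main__":
    k, theta = 2.0, math.radians(30.0)
    p = k * math.sin(theta)
    # For a propagating wave, k(p) coincides with k cos(theta).
    print(vertical_wavenumber(k, p).real, k * math.cos(theta))
    # Normal incidence: p = 0 gives k(0) = k.
    print(vertical_wavenumber(k, 0.0))
```

With p = 0 the function returns k itself, which is why the factor k(p) multiplying R reduces to k for normal incidence.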

We sketch the derivation of the invariant imbedding equations to show the
similarities and differences to layer stripping. A Fourier transform of the Schrodinger
equation (2.1) taking $x \to q = k_x$ yields (recall $h(z,q) = V(z,k_x)$)

$$\left(\frac{\partial^2}{\partial z^2} + k^2(q) - h(z,q)\,*\right) u(z,q) = 0, \qquad (3.1)$$

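where * denotes convolution in q and $k^2(q) = k^2 - q^2$. Defining

$$v(z,q) = \frac{1}{2}\left(u + \frac{1}{ik(q)}\frac{\partial u}{\partial z}\right), \qquad w(z,q) = \frac{1}{2}\left(u - \frac{1}{ik(q)}\frac{\partial u}{\partial z}\right), \qquad (3.2)$$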

it can be shown ([5], p. 93) that v(z,q) and w(z,q) satisfy the coupled system (compare to
(2.8))

$$\frac{\partial}{\partial z}\begin{bmatrix} v \\ w \end{bmatrix} = \begin{bmatrix} ik(q)v - (h*(v+w))/(2ik(q)) \\ -ik(q)w + (h*(v+w))/(2ik(q)) \end{bmatrix}, \qquad (3.3)$$

where all variables are functions of (z,q).

Now imbed the system (3.3) as follows. Let A(z,c,q,p) and B(z,c,q,p) satisfy (3.3),
initialized with $A(z=c,c,q,p) = \delta(q-p)/k(p)$ and $B(z=L,c,q,p) = 0$ (the latter is a
radiation condition; recall V(x,z) has support in 0 < z < L). Then

$$v(z,q) = kA(z,0,q,p); \qquad w(z,q) = kB(z,0,q,p);$$
$$A(c,c,q,p) = \delta(q-p)/k(p); \qquad R(c,q,p) = B(c,c,q,p). \qquad (3.4)$$

Furthermore, $\partial A/\partial c$ and $\partial B/\partial c$ also satisfy (3.3), but with initial conditions

$$\frac{\partial A}{\partial c}(c,c,q,p) = -\frac{\partial A}{\partial z}(c,c,q,p); \qquad \frac{\partial B}{\partial c}(L,c,q,p) = 0. \qquad (3.5)$$

By superposition, the solution to (3.3) with these initial conditions is

$$\frac{\partial A}{\partial c}(z,c,q,p) = -\int A(z,c,q,q')\,k(q')\,\frac{\partial A}{\partial z}(c,c,q',p)\,dq' \qquad (3.6a)$$

$$\frac{\partial B}{\partial c}(z,c,q,p) = -\int B(z,c,q,q')\,k(q')\,\frac{\partial A}{\partial z}(c,c,q',p)\,dq' \qquad (3.6b)$$

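([5], p. 95). We also have from the last of (3.4)

$$\frac{\partial R}{\partial c}(c,q,p) = \frac{d}{dc}B(c,c,q,p) = \frac{\partial B}{\partial z}(c,c,q,p) + \frac{\partial B}{\partial c}(c,c,q,p). \qquad (3.7)$$

Finally, setting z = c in (3.3), substituting into (3.6), and substituting again into (3.7)
gives the following invariant imbedding equation for R(c,q,p):

$$i\frac{\partial R}{\partial c}(c,q,p) = (k + k(q))R(c,q,p) + \frac{h(c,q-p)}{2k(q)k} + \frac{1}{2}\iint R(c,q,q')\,h(c,q'-q'')\,R(c,q'',p)\,dq'\,dq''$$

$$\qquad + \frac{1}{2}\int \left[\frac{h(c,q-q')R(c,q',p)}{k(q)} + \frac{R(c,q,q')h(c,q'-p)}{k}\right] dq'; \qquad R(L,q,p) = 0. \qquad (3.8)$$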

This  is  (11a)  on  p.  97  of  [5]. 

Note that (3.8) is computed recursively in decreasing c, starting at c = L and ending
at c = 0. This must be done for each p, q, and k (recall that R(c,q,p) also depends on
k; this dependence is not shown explicitly in (3.8) since none of the integrations are over
k). Having computed R(0,q,p) for all k, i.e., having computed R(0,q,p,k), the inverse
Fourier transform (2.4) of kR(0,q,0,k) is precisely the reflection response to an impulsive
plane wave normally incident on the medium. This is the scattering data used as input to
the layer stripping algorithm.

3.2  Numerical  Implementation  of  Invariant  Imbedding  Algorithm

Despite its apparent complexity, (3.8) can be implemented numerically in a straightforward
manner by discretization similar to that used to obtain (2.11) from (2.8). Since
(3.8) is already in the wavenumber domain, and the scattering potential h(z,q) is known
exactly, no computational instability issues arise. The integrals may be evaluated using
the trapezoidal rule, and a backward difference approximation to $\partial R/\partial c$ used to
propagate (3.8) in decreasing c from c = L to c = 0.

Once again we assume a periodicity of 1/Δ in the values of all functions of wavenumbers;
this corresponds to the discretized functions being actual discrete values, rather
than sampled values of bandlimited continuous functions. The infinite integrals in (3.6)
and (3.8) become cyclic integrals (computed only over one period), so their evaluation is
straightforward. The multiplication by k + k(q) in (3.8) is windowed to zero for values
greater than 1/(2Δ), as in (2.10), and then periodically extended.
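
Under the periodicity assumption, an integral such as $\int h(c,q-q')R(c,q',p)\,dq'$ in (3.8) becomes a sum over one period with wrap-around indexing, i.e. a circular convolution scaled by the wavenumber spacing. For samples of a periodic function the trapezoidal rule over one period reduces to this uniform sum, because the two endpoint half-weights fall on the same sample. The Python sketch below evaluates the sum directly and via the FFT. NumPy, the function names, and the convention that index j stands for the wavenumber j·dq are assumptions made for illustration; they are not taken from [5].

```python
import numpy as np


def cyclic_convolution_direct(h, r, dq):
    """Approximate the integral of h(q - q') r(q') dq' over one period.

    h and r hold one period of samples on the same uniform grid, so the
    difference q_j - q_m maps to the wrapped index (j - m) mod n.
    """
    n = len(h)
    out = np.zeros(n, dtype=complex)
    for j in range(n):
        for m in range(n):
            out[j] += h[(j - m) % n] * r[m]
    return out * dq


def cyclic_convolution_fft(h, r, dq):
    """Same quantity computed with the circular convolution theorem."""
    return np.fft.ifft(np.fft.fft(h) * np.fft.fft(r)) * dq


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    h = rng.standard_normal(16)
    r = rng.standard_normal(16) + 1j * rng.standard_normal(16)
    direct = cyclic_convolution_direct(h, r, dq=1.0)
    fast = cyclic_convolution_fft(h, r, dq=1.0)
    print(np.max(np.abs(direct - fast)))
```

The direct form costs O(n²) operations per evaluation and the FFT form O(n log n); which one a given implementation of (3.8) used is not stated in the text.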

Note  that  it  is  not  possible  to  compute  the  reflection  response  for  k  =  0  or  k{q)  =  0, 
due  to  the  divisions  by  these  in  (3.8).  The  former  can  be  assumed  to  be  zero,  since  a 
non-zero  DC  reflection  response  would  represent  permanent  displacement  resulting  from 
the impulsive plane wave! The latter corresponds to incidence at 90 degrees, which would
not create a backscattered field in the vertical (z) direction. Hence omitting these does
not present a problem.

3.3  Born  Approximation  to  Invariant  Imbedding  Algorithm

The invariant imbedding equation (3.8) is suggestive of a 2-D version of the Riccati
equation familiar in 1-D scattering in layered media. The two integral terms correspond
to the square term in the 1-D Riccati equation. To aid in understanding (3.8), we now
apply the Born approximation to (3.8), and show that the Born approximation to the
layer stripping algorithm (2.14b) and (2.16) reconstructs the potential from the reflection
response generated by the Born approximation to the invariant imbedding equation (3.8).

As in Section 2.4, we scale the potential by a small parameter ε, expand the wave field
and reflection response in a Taylor series in ε, and discard all terms of order ε² or smaller.
The result is elimination of the two integral-of-products terms in (3.8), leaving

$$i\frac{\partial R}{\partial c}(c,q,p) = (k + k(q))R(c,q,p) + \frac{h(c,q-p)}{2k(q)k}; \qquad R(L,q,p) = 0. \qquad (3.9)$$

Since  there  is  no  longer  coupling  between  R(c,  q,  p)  of  different  p,  we  can  set  p  =  0  (normal 
incidence)  and  solve  the  differential  equation  (3.9),  yielding 

kR{z,  q,  k)  =  (3.10) 

The factor of k multiplying R(z,q,k) is present because k(p)R(z,q,p,k) is the Fourier
transform of the reflection response to an impulse, as discussed in the second paragraph
of Section 3.1. Since p = 0 here, we have k(0) = k, so kR(z,q,k) is the frequency-domain
reflection response to a planar impulse.

Equation (3.10) has a very clear interpretation: to form the reflection response at
depth z in the Born approximation, assume the incident impulsive plane wave penetrates
without being scattered to each depth z', and is then scattered by the potential V(z',q) at
that depth. Then use the migration operator, the complex exponential in k(q) appearing in
(3.10), to migrate each scattered field back to depth z independently (neglecting all
coupling), and superpose the scattered fields due to each V(z',q). At the surface z = 0
this is clear, but it applies to any depth z.
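
The decoupled Born recursion (3.9) can also be checked numerically. The Python sketch below marches (3.9), taken exactly as printed and with p = 0, from c = L down to c = 0 using a backward difference for $\partial R/\partial c$, and compares the result with the integrating-factor solution of the same equation evaluated by the trapezoidal rule. Everything here is illustrative: the function names, the single fixed lateral wavenumber q, and the test potential are assumptions, and the overall sign and phase depend on the transform conventions of [5].

```python
import cmath
import math


def born_reflection_marching(h, k, q, L, n_steps):
    """March (3.9) with p = 0 from c = L down to c = 0.

    h is a callable h(c) giving the potential at one fixed lateral wavenumber.
    A backward difference for dR/dc gives the explicit update
        R(c - d) = R(c) + i d [(k + k(q)) R(c) + h(c) / (2 k(q) k)].
    """
    kq = cmath.sqrt(k * k - q * q)
    d = L / n_steps
    r = 0j  # boundary condition R(L) = 0
    for n in range(n_steps, 0, -1):
        c = n * d
        r = r + 1j * d * ((k + kq) * r + h(c) / (2.0 * kq * k))
    return r


def born_reflection_quadrature(h, k, q, L, n_steps):
    """Integrating-factor solution of (3.9) at c = 0 by the trapezoidal rule:
        R(0) = i / (2 k(q) k) * integral_0^L h(z') exp(i (k + k(q)) z') dz'.
    """
    kq = cmath.sqrt(k * k - q * q)
    d = L / n_steps
    total = 0j
    for n in range(n_steps + 1):
        z = n * d
        weight = 0.5 if n in (0, n_steps) else 1.0
        total += weight * h(z) * cmath.exp(1j * (k + kq) * z)
    return 1j * total * d / (2.0 * kq * k)


if __name__ == "__main__":
    L = 0.5
    # Hypothetical smooth test potential with support on [0, L].
    h = lambda z: 0.1 * math.sin(math.pi * z / L) ** 2
    print(born_reflection_marching(h, 4.0, 1.0, L, 20000))
    print(born_reflection_quadrature(h, 4.0, 1.0, L, 20000))
```

The marching scheme is only first-order accurate in the step size, so the two values agree to within a small relative error that shrinks as n_steps grows. The full equation (3.8) adds the integral coupling terms to the same update.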

Note that $p(k_x,z,k)$ in (2.16) in the time domain is causal for all z (see Fig. 2b), while
a time delay/advance must be included in (3.10). Also note that V(z',q) in (3.10)
is scaled by $-i/(2k(q))$; the reason for this will become apparent in (3.13) below.

Now consider the Born approximation to the layer stripping algorithm (2.14b) and
(2.16) applied to (3.10). Taking the Fourier transform (2.4) of (2.8a) and using (2.16) gives

q(z,q,k)=  +ik^  p{z,q,k)  =  +  ik^  R{z,q,k)e*^^^^\  (3.11) 

Inserting (3.10) into (3.11) shows that the Born-approximated layer stripping algorithm
computes

q{z,  q,  k)  =  -i/(2k(q))Viz,  (3.12) 

from the Born-approximated scattering data. Using (2.14b) shows that the Born-approximated
layer stripping algorithm computes


^ik{q)z 

=  VM  (3-13) 

so that it does indeed correctly compute the scattering potential V(z,q) in the Born
approximation.

4.  Numerical  Results 

4.1  Initialization 

The algorithm described in Section 3 was used to generate the backscattered reflection
response kR(0,q,k) to an impulsive plane wave for several different scattering potentials
V(x,z). The inverse Fourier transform (2.4) of kR(0,q,k) was then used to
initialize the discrete layer stripping algorithm of Section 2, with (recall $q = k_x$)

$$p(0,t,k_x) = R(0,q=k_x,t); \qquad q(0,t,k_x) = 2\frac{\partial}{\partial t}R(0,q=k_x,t). \qquad (4.1)$$

The latter initial condition comes from (2.7a) and the fact that R(z,q,t) = R(0,q,t+z)
in the homogeneous overlying half-space z < 0, since R(0,q,t) is a backscattered (i.e.,
upward-traveling) wave. Note that the sample applications of Section 2 would require
different initial conditions.

4.2  Forward  Problem  vs.  Inverse  Problem  Algorithms

The  invariant  imbedding  algorithm  was  used  to  generate  the  forward  data  so  that  the 
layer  stripping  inverse  problem  algorithm  would  not  simply  run  the  computations  of  the 
forward  problem  algorithm  backwards.  Although  the  two  algorithms  must  of  course  be 
mathematically  equivalent,  since  they  are  both  “exact,”  they  are  derived  from  different 
mathematical  principles. 

Some specific differences between the forward problem (invariant imbedding) algorithm
(FPA) and the inverse problem (layer stripping) algorithm (IPA) are as follows:

1. The FPA propagates the reflection coefficient R(c,q,p,k) at depth c. The IPA propagates
the field and field gradient $p(z,t,k_x)$ and $q(z,t,k_x)$. Note that R ≠ q/p, since
R is the ratio of downgoing and upgoing waves, not field quantities;

2.  The  FPA  operates  in  the  k  (frequency)  domain,  while  the  IPA  operates  in  the  t  (time) 
domain; 

3.  The FPA computes R(c,q,p,k) for all c, q, p, k, while the IPA is initialized using
kR(0,q,0,k), a slice of the FPA function. Note in the FPA (3.8) the integrals over q'
and the differences q' − q''; these clearly have no counterpart in the IPA;

4.  The  FPA  propagates  (3.8),  which  can  be  viewed  as  a  2-D  generalization  of  the  Riccati 
equation  familiar  in  1-D  inverse  scattering.  The  IPA  propagates  the  coupled  system 
(2.8);  note  that  this  differs  from  the  coupled  system  (3.3)  used  to  derive  (3.8); 

5.  While both algorithms are discretized in depth, the FPA results did not vary significantly
with mesh size, so this should not be an issue. The IPA results also did not
vary significantly with mesh size, and gave good results at several different resolutions
(see below).

These differences make it clear that errors are not cancelling out algebraically between
the FPA and the IPA, i.e., the IPA is not effectively running the FPA computations
backwards.

4.3  Summary  of  Results 

The numerical performance of the layer stripping algorithm was studied under a variety
of conditions. The results may be summarized as follows:

1.  The  layer  stripping  algorithm  successfully  reconstructed  the  potential  in  the  absence 
of  noise.  The  only  difficulty  was  due  to  the  smoothing  of  the  transverse  derivative, 
which  slightly  smoothed  very  sharp  variations  in  the  lateral  direction; 

2.  The layer stripping algorithm continued to work well when a small amount of Gaussian
random noise was added to the reflection response. The reconstructed potential was
slightly degraded, of course, but the amount of degradation seemed to vary smoothly
with the amount of noise: a slight increase in noise level did not vastly degrade the
reconstructed potential;

3.  The layer stripping algorithm reconstructions were superior to those using the Born
approximation (as specified in Section 2.4 above), in that the Born approximation
treated multiple scattering events as additional single scattering events, resulting in
errors in the reconstructed potential, particularly for large z. This effect was more
pronounced when the potential had numerically large values; for small potentials
V(nΔ, kΔ)Δ ≪ 1 the Born approximation worked quite well. This was as expected;
multiple scattering involves products of potentials, and multiplying small values results
in even smaller values;

4.  The performance of the algorithm seemed to vary little with the size of the discretization
length Δ, provided that the same Δ was used in the discretized invariant imbedding
algorithm. This suggests there may be a close relation between the discretized
versions of these algorithms. Coarse grid reconstructions seemed to be merely
undersampled versions of the fine grid reconstructions; the basic features of the
reconstructions were identical.

We illustrate these points with some numerical examples below. It should be noted
that the following is only a representative and illustrative sample of our results; the above
conclusions are not based merely on the results below. Unless otherwise specified, all
examples used Δ = 1/32, $L = L_x = 1/2$, and $\Delta_x = 1$. The 3-D plots depict
2-D functions V(x,z); they do not represent objects buried in a homogeneous surrounding
medium.


4.4  Comparison  with  Born  Approximation

The potential V(x,z) is shown in Fig. 3a. Note that this is a smooth, rounded
potential having compact support in both x and z.

The reconstructed potential using the Born approximation is shown in Fig. 3b. Although
Fig. 3b superficially seems to be identical to Fig. 3a, study carefully the deepest
part of the reconstructed V(x,z). The original V(x,z) is zero for z > 24/32, while the
Born-reconstructed V(x,z) does not become zero until z > 26/32; it has a “tail.” This
“tail” is caused by multiple scattering that is interpreted under the Born approximation
as primary scattering due to an additional non-zero portion of the scattering potential;
actually, there is no such portion.

The reconstructed potential using the layer stripping algorithm is shown in Fig. 3c.
This reconstruction has no “tail”; the multiple scattering that produces it has been
accounted for in the algorithm and eliminated. The reconstruction is almost perfect.

A  different  potential  is  shown  in  Fig.  4a.  Note  that  this  potential  function  is  constant 
over  a  central  “plateau,”  and  then  drops  off  rapidly  to  zero. 

The reconstructed potential using the Born approximation is shown in Fig. 4b. Note
again the presence of a “tail” at its deepest part, while there is no “tail” at its shallowest
part, since multiple scattering has not yet had time to occur in this part of the
time-domain impulse response (the lack of symmetry in z is apparent if one looks at the figure
as a whole). Also note the problems in reconstructing the lateral edges of the potential
function; the central “plateau” is much smaller than it should be.

The reconstructed potential using the layer stripping algorithm is shown in Fig. 4c.
Again the “tail” caused by multiple scattering has been eliminated. However, the shallowest
and deepest edges of the “plateau” have been rounded off slightly. Since this is
symmetric between the shallowest and deepest parts, it is not due to multiple scattering.
We attribute it to smoothing in the transverse derivative.

4.5  Effects  of  Additive  Noise

The potential V(x,z) used in Figs. 3 was scaled as shown in Fig. 5a, and Gaussian
random noise was added to the reflection response R(0,q,k). The signal-to-noise ratio,
computed as the square root of the sum of the squares of the discrete signal values divided
by the square root of the sum of the squares of the discrete noise values, was found to
be 36 dB for one run and 18 dB for another (to get power SNR these values should be
doubled). Note that any powers of Δ and numbers of points being averaged will cancel in
this ratio.
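
The signal-to-noise ratio defined above can be written out as a short Python sketch. It reads the parenthetical remark about doubling as meaning that the quoted figures are 10 log10 of the root-sum-square ratio, so that the power SNR in dB is twice as large; that reading, the function name, and the use of NumPy are assumptions.

```python
import numpy as np


def snr_db(signal, noise):
    """Return (amplitude_db, power_db) for discrete signal and noise values.

    amplitude_db is 10*log10 of the root-sum-square ratio described in the
    text; power_db is twice that value, i.e. 10*log10 of the power ratio.
    """
    signal = np.asarray(signal)
    noise = np.asarray(noise)
    ratio = np.sqrt(np.sum(np.abs(signal) ** 2)) / np.sqrt(np.sum(np.abs(noise) ** 2))
    amplitude_db = 10.0 * np.log10(ratio)
    return amplitude_db, 2.0 * amplitude_db


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.standard_normal(1024)
    noise = 0.01 * rng.standard_normal(1024)
    print(snr_db(clean, noise))
    # A common scale factor (for example a power of the grid spacing) cancels.
    print(snr_db(0.5 * clean, 0.5 * noise))
```

Because the ratio is formed from sums over the same number of points, neither the number of samples nor a common scale factor changes the result, in line with the closing remark of the paragraph above.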

The reconstructions at 36 dB are virtually perfect; in fact, the reconstructions shown
in Figs. 3 are actually these reconstructions. The reconstructions at 18 dB are shown in
Fig. 5b using the Born approximation and Fig. 5c using the layer stripping algorithm.
Note that even in these noisy reconstructions the “tail” is still a significant feature in the
Born approximation reconstruction, while the layer stripping reconstruction has correctly
removed the “tail.”

To see the degradation of the layer stripping algorithm in the presence of increasing
amounts of noise added to the reflection response, study Figs. 6. Fig. 6a shows the original
potential function, which is the same as Fig. 4a. Fig. 6b shows a noisy reconstruction of
the potential function shown in Fig. 6a, and in Fig. 6c the signal-to-noise ratio has been
reduced by a factor of four. The increasing degradation of the reconstruction is obvious,
but the layer stripping algorithm does not fall apart even in large amounts of additive
noise.

A similar study is carried out for a different potential function in Figs. 7. Fig. 7a shows
the original potential function, and Figs. 7b and 7c correspond to Figs. 6b and 6c. The
only notable feature of the layer stripping reconstructions is the slight (one pixel wide)
“shelf” induced by the smoothed transverse derivative (this is discussed in more detail
below); otherwise, the reconstructed potential smoothly degrades in increasing noise.

Note the presence of the sharp ridge along the line z = 0 in Figs. 6 and 7. This
ridge is due to the non-zero mean of the noise being added to R(0,q,k). When V(x,z) is
computed by taking the inverse Fourier transform (2.9), this non-zero mean, a constant
in the Fourier wavenumber k domain, becomes an impulse in the spatial z domain. This
impulse is the ridge.
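
The origin of the ridge can be reproduced with a discrete transform. In the Python sketch below, NumPy's inverse FFT stands in for the transform (2.9), whose exact form and normalization are not reproduced here, so this is an analogy under that assumption; the function name is hypothetical.

```python
import numpy as np


def inverse_transform_of_offset(n_samples, offset):
    """Inverse DFT of a constant: all of it lands in the first (z = 0) sample."""
    spectrum = np.full(n_samples, offset, dtype=complex)
    return np.fft.ifft(spectrum)


if __name__ == "__main__":
    print(np.round(inverse_transform_of_offset(8, 0.3), 12))

    # For any noise sequence, the z = 0 sample of the inverse DFT equals the
    # sample mean, so a non-zero mean shows up at z = 0.
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(64) + 0.3
    print(np.fft.ifft(noise)[0].real, noise.mean())
```

Subtracting the sample mean from the noise before transforming would remove this contribution at z = 0; the text does not say whether that was tried.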

4.6  Discussion  of  Numerical  Stability  in  Noise 

The smooth degradation of the reconstructed potential with increasing noise levels
might seem surprising, since the inverse scattering problem is known to be ill-conditioned.
The reason for this is that multiple scattering has a relatively small (compared to single
scattering) effect, so that the Born approximation result will be approximately the same as
the layer stripping result. The Born approximation is linear, so that any noise added to the
reflection response will produce an addition to the reconstructed potential whose strength
is directly proportional to the noise strength (halving the noise will halve the addition);
hence the Born-reconstructed potential will degrade smoothly, and it is not surprising that
the reconstructed potential from layer stripping also degrades smoothly.

This heuristic argument should not be taken too far; in the 1-D case, it is well known
that large noise levels can cause severe problems in layer stripping algorithms, and indeed
in any “exact” method. The reason for this is NOT numerical instability, as is commonly
believed; the 1-D layer stripping algorithm is identical to the Schur algorithm (see [8]),
which is known to be numerically stable.

The reason that 1-D layer stripping algorithms can give unstable results when they
are applied to noisy reflection data is as follows. It is well known that the free-surface
reflection response of a 1-D layered medium to an impulsive plane wave below the surface
is one side of the autocorrelation of its transmission response; hence it must be positive
semi-definite. Noise added to the reflection response can make the two-sided response (the
reflection response added to its time reversal) become non-positive-semi-definite, in which
case it is no longer the reflection response of any layered medium. The problem is now
ill-posed, in the sense of having no solution; it is not surprising that the layer stripping
algorithms become unstable.
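
One way to test the positive semi-definite property numerically is to form the symmetric Toeplitz matrix of the two-sided sequence and inspect its smallest eigenvalue. The Python sketch below does this for a one-sided sequence of lags 0, 1, 2, ... and shows that a large perturbation of a genuine autocorrelation destroys the property. The helper names, the tolerance, the example sequence, and the assumption that the zero-lag term is stored in r[0] are all illustrative; this is a standard check, not necessarily the procedure used by the authors.

```python
import numpy as np


def toeplitz_from_one_sided(r):
    """Symmetric Toeplitz matrix T[i, j] = r[|i - j|] built from lags 0, 1, ..."""
    r = np.asarray(r, dtype=float)
    lags = np.abs(np.arange(len(r))[:, None] - np.arange(len(r))[None, :])
    return r[lags]


def is_positive_semidefinite(r, tol=1e-10):
    """True if the two-sided sequence built from r is positive semi-definite."""
    eigenvalues = np.linalg.eigvalsh(toeplitz_from_one_sided(r))
    return bool(eigenvalues.min() >= -tol * max(1.0, abs(r[0])))


def one_sided_autocorrelation(t):
    """Lags 0 .. len(t)-1 of the autocorrelation of a real sequence t."""
    t = np.asarray(t, dtype=float)
    full = np.correlate(t, t, mode="full")  # zero lag sits at index len(t) - 1
    return full[len(t) - 1:]


if __name__ == "__main__":
    r = one_sided_autocorrelation([1.0, -0.5, 0.25, 0.1])
    print(is_positive_semidefinite(r))      # a true autocorrelation is feasible

    r_noisy = r.copy()
    r_noisy[1] += 2.0 * r[0]                # large perturbation of the lag-1 value
    print(is_positive_semidefinite(r_noisy))
```

In the second case the lag-1 value exceeds the zero-lag value in magnitude, which no autocorrelation can do, so the perturbed sequence cannot be one side of the autocorrelation of any transmission response.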

However, small amounts of additive noise will not cause the reflection response to
become non-positive-semi-definite; as long as this is true, the layer stripping algorithms
will behave well numerically. Our results in this paper suggest that a similar situation
is present in the 2-D inverse scattering problem considered here; this is a topic of current
research.

4.7  Smoothed  Reconstructions  Due  to  Smoothed  Transverse  Derivative

The smoothing in the transverse derivative incurred by using the clipped filter (2.10)
causes a slight but noticeable smoothing of V(x,z) along the x direction. This was manifested
in the reconstructions in Figs. 7 by the “shelf” that appeared at the ends of the
reconstructed potential function. Another example of this is illustrated in Figs. 8. Fig. 8a
shows the original potential function, which was produced by taking the potential function
of Fig. 7a and adding random noise to it. The idea here is that in real life potential
functions will not have simple analytic forms; they will be complicated functions. Hence
Fig. 8a is closer to a realistic potential function.

The reconstructed potential from the layer stripping algorithm is shown in Fig. 8b.
Note again the one-pixel-wide “shelf” at each of the two flat ends of the potential function.
We attribute this to the smoothing of the transverse derivative in the layer stripping
algorithm; unable to reconstruct the sharp jump from zero, the algorithm provides a
laterally smoothed reconstruction in which the reconstructed potential takes two smaller
lateral jumps instead of a single large jump. Note that the “shelf” is half the height of the
jump in x at each depth z.

Also note in Fig. 8b that the “noisy” part of the potential in Fig. 8a has been noticeably
smoothed. This again seems to be due to the smoothing in the transverse derivative;
note that the reconstructed potential is “rougher” in the z direction (for which there is
no smoothing) than in the x direction (in which there is smoothing). This smoothing
effect should be taken into consideration in potentials reconstructed using layer stripping
algorithms.

4.8  Effect  of  Discretization  Length  Δ

The above numerical runs all used Δ = 1/32. Results for a larger Δ = 1/16 are
shown in Figs. 9. Fig. 9a shows the original potential, which is an undersampled version
of the potential in Fig. 4a. The reconstructed potential using the Born approximation
is shown in Fig. 9b. Note how poorly the Born approximation reconstructs the central
“plateau” of the potential function. The reconstructed potential using the layer stripping
algorithm is shown in Fig. 9c. Although the central “plateau” is reconstructed quite well,
the potential function as a whole is spread out one pixel in each direction. This shows that
while the invariant imbedding and layer stripping algorithms are clearly closely connected,
the discretized layer stripping algorithm is NOT merely running the invariant imbedding
algorithm backwards.

Results for a smaller Δ = 1/64 are shown in Figs. 10. Fig. 10a shows the original
potential, which is a more finely sampled version of the potential in Fig. 4a. The
reconstructed potential using the layer stripping algorithm is shown in Fig. 10b. The
reconstruction is almost perfect; even the lateral smoothing caused by the smoothed transverse
derivative is not apparent. This is due to the fact that although $\Delta_x = 1$, the maximum
value of $k_x$ is now 32 instead of 16; the smoothing starts at a much higher wavenumber.
A very close comparison of Figs. 10a and 10b shows that the reconstruction is not quite
perfect; the reconstructed potential is still spread out one pixel in each direction. But this
effect is virtually negligible on this scale.
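
The values 16 and 32 quoted above agree with a maximum lateral wavenumber of 1/(2Δ), the window edge mentioned in Section 3.2. The short Python sketch below tabulates this for the three discretization lengths used in this section; the function name is hypothetical, and identifying the maximum $k_x$ with 1/(2Δ) is an inference from the quoted figures rather than a formula stated in the text.

```python
def max_lateral_wavenumber(delta: float) -> float:
    """Largest lateral wavenumber retained for discretization length delta."""
    return 1.0 / (2.0 * delta)


if __name__ == "__main__":
    for delta in (1 / 16, 1 / 32, 1 / 64):
        print(delta, max_lateral_wavenumber(delta))
```

Halving Δ doubles the wavenumber at which the clipped filter begins to act, which is consistent with the weaker lateral smoothing reported for Δ = 1/64.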

One final example combines a smaller Δ, additive noise in the reflection response,
and smoothed reconstruction. Fig. 11a shows the original potential, which is a more finely
sampled version of the potential in Fig. 7a. Random noise was added to the reflection data,
at a signal-to-noise ratio of 15.7 dB. The reconstructed potential using the layer stripping
algorithm is shown in Fig. 11b. All the features discussed in Section 4.3 are again present
in Fig. 11b. These include the “shelf,” still one pixel wide, the ridge along the line z = 0,
and the main shape of the potential function still visible in the noise. This shows that
these effects occur at different discretization lengths, and indeed may be endemic to layer
stripping reconstructions in noise for any Δ.

5.  Conclusion 

The numerical performance of the 2-D layer stripping algorithm of [4] has been studied
for the first time. This represents the first numerical implementation of an “exact”
noniterative inverse scattering algorithm that includes the effects of all multiple scattering
and diffraction effects. The forward scattering data were generated using the invariant
imbedding algorithm of [5]. The results indicated that layer stripping is a viable technique
for solving 2-D Schrodinger equation inverse potential problems, for which two applications
were briefly reviewed.

Two particularly important results were that: (1) the “exact” reconstructions using
the layer stripping algorithm are a noticeable improvement over the Born approximation
reconstructions; and (2) small amounts of additive noise in the reflection response do not
cause numerical instability in the layer stripping algorithm. The results were illustrated
using several numerical examples. It was also shown for the first time that the Born
approximation to the layer stripping algorithm reconstructs the scattering potential from
the reflection response generated by the Born approximation to the invariant imbedding
algorithm of [5].

Abstract. We reformulate the 2-D Schrodinger equation inverse scattering problem
as a multichannel two-component wave system by Fourier transforming the Schrodinger
equation in the lateral spatial variable. Discretization results in new 2-D layer stripping
algorithms, which incorporate multichannel transmission effects; this also leads to an
important new feasibility condition on impulse reflection response data. A 2-D discrete
Schrodinger equation is defined, and analogous results are obtained. Numerical examples
illustrate the new results.

1.  Introduction.

1.1 Applications and Previous Approaches. The inverse scattering problem for
the Schrodinger equation in two dimensions with a time-independent, local, non-circularly
symmetric potential has many applications. These applications include: (1) reconstruction
of a three-dimensional (3-D) acoustic medium with density and wave speed varying in two
dimensions (2-D), from surface measurement of the steady-state medium displacement
response to a harmonic line source [1]; and (2) reconstruction of a 3-D electrical medium
with resistivity varying in 2-D from surface measurement of the potential resulting from a
line DC current source [2].

Two major approaches for exact solution of the 2-D Schrodinger equation inverse
potential problem have been proposed. The first is the 2-D version of the Gel’fand-Levitan
and Marchenko integral equation methods [3]. The other is the 2-D version of the layer
stripping differential methods [4]. Here “exact” means that all diffraction and multiple
scattering effects are included in the mathematical solution; errors in the solution will
arise solely due to purely numerical effects such as discretization and roundoff. Hence
all methods based on the Born (single-scattering) approximation are excluded here, since
such methods, and their modifications, do not take into account all multiple scattering
effects. No numerical implementation of the integral equation methods of [3] has been
reported. The 2-D layer stripping algorithm proposed in [4] was numerically implemented
successfully in [5], and shown to be robust in the presence of small noise levels.

1.2 New Results. This paper proposes two new 2-D layer stripping algorithms different
from those of [4] and [5]. The new algorithms have the form of a multichannel coupled
two-component wave system in which downgoing and upgoing wave matrices are scattered
into each other, and where each channel corresponds to a different lateral wavenumber.
This form is somewhat reminiscent of the wave systems of [6] et seq., in which downgoing
and upgoing waves are again defined for 2-D scattering media. However, the wave system
of this paper differs from that of [6] in three important ways: (1) it is derived for the
Schrodinger equation inverse potential problem, instead of the wave equation or Maxwell’s
equations, as in [6]; (2) it is the basis for a computationally efficient layer stripping type
algorithm, instead of a much more computationally intensive invariant imbedding algorithm,
as in [6]; and (3) it is much simpler in form than the wave system proposed in [6].

The last point is particularly important, as it leads to a central result of this paper,
which is a feasibility condition on the impulse reflection data. Specifically, we show that
the lateral Fourier transform of the free-surface (perfectly reflecting) impulse reflection
response input must be a 2-D positive definite function in time and lateral wavenumber.
This generalizes a famous result of Kunetz [7] to the 2-D case. While the Kunetz result
was generalized to the multichannel case in [8], this is the first result for the 2-D inverse
potential problem for the Schrodinger equation. Although our result is derived using a
discrete argument, it holds for arbitrarily small discretizations.

The  significance  of  this  result  is  that  additive  noise  in  the  data  may  destroy  this 
positive  definite  property,  and  this  will  render  the  layer  stripping  algorithms  numerically 
unstable.  Moreover,  the  reputation  for  numerical  instability  of  layer  stripping  algorithms 
in  noise  is  due  to  infeasible  data:  if  the  data  are  not  positive  definite,  then  there  is  no 
scattering  potential  that  could  have  produced  this  reflection  response.  We  show  that 
filtering  noisy  data  to  produce  feasible  data  results  in  a  numerically  stable  reconstruction. 

1.3 Organization. This paper is organized as follows. In Section 2 we quickly
review results for the 1-D Schrodinger equation inverse scattering problem. This includes
the Miura transform between the Schrodinger equation and two-component wave systems,
discussion of discretization vs. discrete media, and the Kunetz result [7] and its significance.
In Section 3 we generalize the results of Section 2 to the 2-D case by first taking a Fourier
transform in the lateral spatial variable. This results in two new types of layer stripping
algorithms, which in turn solve discrete counterparts to integral equations. This is the first
explicit demonstration of the direct link between differential and integral equation methods
for 2-D inverse scattering; it generalizes the results of [9],[10] to the multidimensional case.
Section 3 also generalizes the Kunetz result [7] to the 2-D inverse scattering problem.
Section 4 produces similar results for the 2-D inverse scattering problem in which the
excitation is a point source, rather than an infinite plane wave. Results for this problem
are simpler than those of Section 3. Section 5 presents some numerical examples which
illustrate: (1) the numerical performance of the new algorithm; (2) the significance of
multiple scattering, by comparison with results using the Born approximation; (3) how
additive noise can result in infeasible data; and (4) how the infeasible data can be projected
onto the subspace of feasible data, which results in a numerically stable reconstruction.
Section 6 concludes with a summary and some suggestions for future research.


2.  Review  of  1-D  Results.  We  quickly  review  some  pertinent  results  for  the 
continuous  and  discrete  1-D  inverse  scattering  problems.  These  include  continuous  and 
discrete  Schrodinger  equations,  Miura  transforms,  two-component  wave  systems,  and  the 
Krein integral equation. We also review the layer stripping/Schur algorithm and Kunetz
result  for  the  discrete  problem.  Section  3  will  generalize  all  of  these  results  to  2-D. 

2.1. Continuous Problems. The 1-D Schrodinger equation inverse scattering
problem is defined as follows. A wave field u(x,k), where x denotes depth and k denotes
frequency (for unit wave speed) or wavenumber, satisfies the Schrodinger equation

$$\left(\frac{d^2}{dx^2} + k^2 - V(x)\right) u(x,k) = 0, \qquad (2.1)$$
where R(k) is known. The goal is to reconstruct the scattering potential {V(x), x > 0}
from the Fourier transform R(k) of the impulse reflection response. This formulation
was first applied to the inverse scattering problem of reconstructing a continuous layered
acoustic medium from its impulse reflection response in [11]; there have been many other
applications since then.

Now define the two-component wave system

$$\frac{d}{dx}\begin{bmatrix} D(x,k) \\ U(x,k) \end{bmatrix} = \begin{bmatrix} -ik & -r(x) \\ -r(x) & ik \end{bmatrix} \begin{bmatrix} D(x,k) \\ U(x,k) \end{bmatrix} \qquad (2.3)$$

where D(x,k) and U(x,k) are Fourier transforms of downgoing and upgoing waves,
respectively, and r(x) is the reflectivity function. The boundary conditions for (2.3) are

$$D(x,k) = e^{-ikx}; \quad U(x,k) = R(k)e^{ikx}, \quad x < 0 \qquad (2.4a)$$

$$D(x,k) = T(k)e^{-ikx}; \quad U(x,k) = 0, \quad x \to \infty. \qquad (2.4b)$$
The inverse scattering problem for the wave system (2.3) is to reconstruct the reflectivity
function {r(x), x > 0} from R(k). Again, this formulation has been applied to the inverse
scattering problem of reconstructing a continuous layered acoustic medium from its impulse
reflection response (see the reference list of [9]).

The Schrodinger equation (2.1) can be transformed into the two-component wave
system (2.3) using the Miura transform [9]. Given the potential V(x), define the reflectivity
function r(x) as the solution to

$$V(x) = r^2(x) - \frac{dr(x)}{dx}; \quad u(x,k) = D(x,k) + U(x,k), \qquad (2.5)$$

where r(0) is assumed to be known (since x = 0 denotes the surface of the scattering
medium). Then the inverse scattering problem defined by (2.1)-(2.2) is equivalent to the
inverse scattering problem defined by (2.3)-(2.4).
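As a small numerical illustration of (2.5) used as a formula rather than as a differential equation, the following sketch evaluates V(x) = r^2(x) - dr/dx on a uniform grid. The test profile r(x) = tanh(x), the grid, and the function name are assumptions chosen only for illustration; they do not come from the paper. For this profile the closed form is V(x) = 1 - 2 sech^2(x), which gives something to compare against.

```python
import numpy as np

def potential_from_reflectivity(r, dx):
    """Evaluate V(x) = r(x)^2 - dr/dx on a uniform grid, as in (2.5)."""
    r = np.asarray(r, dtype=float)
    # np.gradient uses central differences inside and one-sided ones at the ends.
    drdx = np.gradient(r, dx)
    return r**2 - drdx

# Hypothetical test profile (not from the paper): r(x) = tanh(x),
# for which V(x) = tanh^2(x) - sech^2(x) = 1 - 2 sech^2(x).
x = np.linspace(-5.0, 5.0, 2001)
dx = x[1] - x[0]
V = potential_from_reflectivity(np.tanh(x), dx)
V_exact = 1.0 - 2.0 / np.cosh(x) ** 2
max_err = np.max(np.abs(V - V_exact))
```

The only approximation here is the finite-difference derivative, so the error shrinks as the grid is refined.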

2.2. Derivation of Layer Stripping Algorithm. We review the derivation of
the layer stripping algorithm for the two-component wave system inverse scattering
problem (2.3)-(2.4). The point here is to compare the discretized continuous layer stripping
algorithm with the Schur algorithm in Section 2.3.

In the time domain (2.3) becomes the pair of equations

$$\left(\frac{\partial}{\partial x} + \frac{\partial}{\partial t}\right) D(x,t) = -r(x)U(x,t) \qquad (2.6a)$$

$$\left(\frac{\partial}{\partial x} - \frac{\partial}{\partial t}\right) U(x,t) = -r(x)D(x,t) \qquad (2.6b)$$

where $D(x,t) = \int_{-\infty}^{\infty} D(x,k)e^{ikt}\,dk$ is the inverse Fourier transform of D(x,k), and
similarly for U(x,t). D(x,t) and U(x,t) are clearly waves, since (2.6) describe quantities that
propagate in increasing and decreasing depth x as t increases. The reflectivity function
r(x) describes how much of each wave is reflected into the other wave at each depth x.

Since the impulse reflection response R(t) is causal, it is clear that D(x,t) and U(x,t)
have the forms

$$D(x,t) = \delta(t-x) + \tilde{D}(x,t)\,1(t-x) \qquad (2.7a)$$

$$U(x,t) = \tilde{U}(x,t)\,1(t-x) \qquad (2.7b)$$
where $\tilde{D}(x,t)$ and $\tilde{U}(x,t)$ are the smooth parts of D(x,t) and U(x,t) (both of which jump
at t = x), and where 1(·) is the unit step or Heaviside function. Inserting (2.7) into (2.6)
and using a propagation of singularities argument (this amounts to equating coefficients
of δ(t − x)) yields

$$\left(\frac{\partial}{\partial x} + \frac{\partial}{\partial t}\right) \tilde{D}(x,t) = -r(x)\tilde{U}(x,t) \qquad (2.8a)$$

$$\left(\frac{\partial}{\partial x} - \frac{\partial}{\partial t}\right) \tilde{U}(x,t) = -r(x)\tilde{D}(x,t) \qquad (2.8b)$$

$$r(x) = 2\tilde{U}(x,x^{+}). \qquad (2.8c)$$

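We now discretize depth x = mΔ and time t = nΔ to integer multiples of some
discretization length Δ. Since the wave speed is unity, depth and time have the same Δ.
Using forward differences, (2.8) discretizes to

$$\begin{bmatrix} \tilde{D}(x+\Delta,t) \\ \tilde{U}(x+\Delta,t) \end{bmatrix} = \begin{bmatrix} 1 & -r(x)\Delta \\ -r(x)\Delta & 1 \end{bmatrix} \begin{bmatrix} \tilde{D}(x,t-\Delta) \\ \tilde{U}(x,t+\Delta) \end{bmatrix} \qquad (2.9a)$$

$$r(x)\Delta = \tilde{U}(x,x)/\tilde{D}(x,x-2\Delta) \qquad (2.9b)$$

$$D(0,t) = 1; \quad U(0,t) = \int_{-\infty}^{\infty} R(k)e^{ikt}\,dk. \qquad (2.9c)$$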

Equations (2.9) are the layer stripping algorithm for solving the inverse scattering problem
specified by (2.3)-(2.4). Note that the factor of 2 in (2.8c) disappears in (2.9b), which
follows from (2.9a) by setting t = x and noting that $\tilde{U}(x+\Delta, x-\Delta) = 0$ by causality.
Once r(x) has been reconstructed, the Schrodinger equation inverse scattering problem
(2.1)-(2.2) is solved by computing V(x) from r(x) using (2.5). Note that the differential
equation (2.5) need never be solved: (2.5) is a formula for computing V(x) from r(x).

2.3. Discrete Problems. The discrete 1-D Schrodinger equation inverse scattering
problem is defined as follows. A wave field u(i,j), where i and j are depth and time indices,
satisfies the discrete Schrodinger equation

$$u(i+1,j) + u(i-1,j) - u(i,j+1) - u(i,j-1) = V(i)\,u(i-1,j), \qquad (2.10)$$

along with the boundary conditions

$$u(i,j) = \begin{cases} \delta(j-i) + R(j+i), & \text{if } i \le 0; \\ T(j-i), & \text{if } i \to \infty. \end{cases} \qquad (2.11)$$
The discrete inverse scattering problem is to reconstruct V(i) from R(j). This problem
has been formulated and discussed in [12] and [13]. Note that the left side of (2.10) can
be viewed as a discrete form of the inverse Fourier transform of the left side of (2.1). If
the indices are scaled to be multiples of Δ, then (2.10) approaches the inverse Fourier
transform of (2.1) as Δ → 0.

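The discrete two-component wave system is defined as

$$\begin{bmatrix} D(i+1,j) \\ U(i+1,j) \end{bmatrix} = \frac{1}{\sqrt{1-r(i)^2}} \begin{bmatrix} 1 & -r(i) \\ -r(i) & 1 \end{bmatrix} \begin{bmatrix} D(i,j-1) \\ U(i,j+1) \end{bmatrix}, \qquad (2.12)$$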

where D(i,j) and U(i,j) are the downgoing and upgoing waves just below the ith interface
at time j. Note the time shifts required by the unit time required to propagate through
each layer of unit thickness at unit wave speed. The boundary conditions are

$$D(i,j) = \delta(j-i); \quad U(i,j) = R(j+i), \quad i \le 0 \qquad (2.13a)$$

$$D(i,j) = T(j-i); \quad U(i,j) = 0, \quad i \to \infty. \qquad (2.13b)$$

The discrete inverse scattering problem is to reconstruct r(i) from R(j). This problem
has been analyzed in detail in [10] and earlier references listed in [10]; r(i) is the interface
reflection coefficient.

The discrete Schrodinger equation (2.10) can be transformed into the two-component
wave system (2.12) using a discrete form of the Miura transform. Given the potential V(i),
define the reflection coefficient r(i) as the solution to (compare to (2.5))

$$V(i) = r(i)r(i-1) - (r(i) - r(i-1)); \quad u(i-1,j) = D(i,j-1) + U(i,j+1), \qquad (2.14)$$

where r(0) is assumed to be known (since i = 0 denotes the surface of the scattering
medium). Then the inverse scattering problem defined by (2.10)-(2.11) is equivalent to the
inverse scattering problem defined by (2.12)-(2.13). The transformation (2.14) was first
employed in [14] to derive the so-called split Levinson and Schur algorithms. Its application
to inverse scattering as a discrete form of the Miura transform was first noted in [15]. Note
again that (2.14) is a formula for computing V(i) from r(i), not a difference equation.
2.4. Schur Algorithm. The discrete inverse scattering problem defined by (2.12)-
(2.13) can be solved very easily using the Schur algorithm [9],[10], a famous signal
processing algorithm in the solution of Toeplitz systems of equations. The Schur algorithm
is identical to the layer stripping algorithm (2.9), with Δ = 1 and independent variables
(x,t) replaced with (i,j). Note that D(i,j) and U(i,j) are zero for odd values of i+j. The
reason for this is physically quite clear: since waves require two time units to propagate
through a layer and return, waves can arrive at a given interface only for alternate values
of time j.

The  major  point  here,  whose  importance  cannot  be  overemphasized,  is  that  the  success 
of  the  layer  stripping  algorithm  is  due  entirely  to  its  equivalence  to  the  Schur  algorithm. 
The  algorithm  (2.9),  which  is  ostensibly  a  discretization  of  continuous  equations  for  solving 
a  continuous  problem,  is  in  fact  solving  the  explicitly  discrete  problem  (2.12)-(2.13)  exactly, 
without approximation. Furthermore, the Schur algorithm has excellent numerical stability
properties  because  it  reconstructs  exactly  the  lossless  medium  described  by  (2.12). 
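The claim that the recursion solves the discrete problem exactly can be checked with a short sketch. The code below is an illustration under stated assumptions, not the paper's implementation: time is measured in retarded two-way travel-time samples (so the sample index differs from the (i,j) indexing of (2.12)), the common factor 1/sqrt(1 - r^2) is dropped because it cancels in the ratio that gives each reflection coefficient, and the function names and test coefficients are hypothetical. The forward model is computed independently, by a bottom-up recursion on the reflection response, so the round trip is a genuine check.

```python
import numpy as np

def reflection_response(r, n_samples):
    """Forward model: impulse reflection response of a stack of interfaces.

    r[i] is the reflection coefficient of interface i; adjacent interfaces are
    one two-way travel-time sample apart. Uses the bottom-up recursion
        Gamma_i = (r_i + q Gamma_{i+1}) / (1 + r_i q Gamma_{i+1}),
    with q a one-sample delay, carried out on truncated power series.
    """
    gamma = np.zeros(n_samples)
    for ri in r[::-1]:
        delayed = np.concatenate(([0.0], gamma[:-1]))  # q * Gamma_{i+1}
        num = delayed.copy()
        num[0] += ri
        den = ri * delayed
        den[0] += 1.0
        out = np.zeros(n_samples)
        for n in range(n_samples):  # power-series division num / den, den[0] = 1
            out[n] = num[n] - np.dot(den[1:n + 1], out[:n][::-1])
        gamma = out
    return gamma

def schur_layer_strip(R, n_layers):
    """Recover reflection coefficients from the reflection response R."""
    d = np.zeros(len(R))
    d[0] = 1.0                      # probing impulse
    u = np.array(R, dtype=float)    # measured upgoing wave at the surface
    r = np.zeros(n_layers)
    for i in range(n_layers):
        r[i] = u[0] / d[0]          # first arrival identifies the interface
        d_next = d - r[i] * u       # strip the layer: update downgoing wave
        u_next = u - r[i] * d       # update upgoing wave
        d, u = d_next[:-1], u_next[1:]  # advance the upgoing wave one sample
    return r

r_true = np.array([0.5, -0.3, 0.2, 0.1])   # hypothetical medium
R = reflection_response(r_true, 8)
r_est = schur_layer_strip(R, len(r_true))
```

Because each step uses only additions, multiplications and one division, the recovered coefficients agree with the true ones to roundoff when the data are noise-free, which is the sense in which the discrete problem is solved without approximation.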

The two-component wave system (2.12) is lossless and well-behaved numerically as
long as |r(n)| < 1 [10]. The necessity of |r(n)| < 1 is clear from the transmission loss
factor $\sqrt{1 - r(i)^2}$ in (2.12); sufficiency follows from the fact that (2.12) can be rearranged
into the scattering form

$$\begin{bmatrix} D(i+1,j) \\ U(i,j+1) \end{bmatrix} = \begin{bmatrix} \sqrt{1-r(i)^2} & -r(i) \\ r(i) & \sqrt{1-r(i)^2} \end{bmatrix} \begin{bmatrix} D(i,j-1) \\ U(i+1,j) \end{bmatrix} \qquad (2.15)$$

in which the single-layer scattering matrix is orthogonal. Note that for discrete lossy
transmission lines and discrete layered acoustic media, the interface reflection coefficient
is r(i) = (Z(i+1) − Z(i))/(Z(i+1) + Z(i)), where Z(i) is the impedance of the ith layer,
so that |r(n)| < 1 is guaranteed.
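The two facts used above, that positive impedances force |r| < 1 and that the single-layer scattering matrix is orthogonal, are easy to confirm numerically. The sketch below assumes real, positive layer impedances; the impedance values and function names are hypothetical and serve only as an example.

```python
import numpy as np

def reflection_from_impedances(Z):
    """Interface reflection coefficients r(i) = (Z(i+1) - Z(i)) / (Z(i+1) + Z(i))."""
    Z = np.asarray(Z, dtype=float)
    return (Z[1:] - Z[:-1]) / (Z[1:] + Z[:-1])

def scattering_matrix(r):
    """Single-layer scattering matrix for reflection coefficient r, |r| < 1."""
    t = np.sqrt(1.0 - r**2)          # transmission loss factor
    return np.array([[t, -r],
                     [r,  t]])

Z = [1.0, 2.5, 1.5, 4.0]             # hypothetical positive layer impedances
r = reflection_from_impedances(Z)
S = scattering_matrix(r[0])
orthogonality_error = np.max(np.abs(S.T @ S - np.eye(2)))
```

Orthogonality means the matrix preserves the sum of squares of its inputs, which is the energy-conservation statement behind the word "lossless".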

2.5. Kunetz Condition. Now suppose that the top interface is a free surface
(perfect reflector), and that the probing impulse in (2.13a) is introduced just below this
surface (so that now D(i,j) = δ(j − i) + R(j − i)). In this case, the Schur algorithm
computes the reflection coefficients associated with the Toeplitz system of equations (2.16),
whose system matrix is

$$\begin{bmatrix} 1 & R(1) & \cdots & R(n) \\ R(1) & 1 & \cdots & R(n-1) \\ \vdots & \vdots & \ddots & \vdots \\ R(n) & R(n-1) & \cdots & 1 \end{bmatrix} \qquad (2.16)$$

where the R(j) are the impulse reflection response defined in (2.13a), the remaining
quantities in (2.16) are elements of the matrix Green's function of (2.12) (see [10]), and
$t_n = \prod_i \sqrt{1 - r(i)^2}$, with the product taken over the n layers, is the transmission loss
through n layers. The reflection coefficients r(n) can be recovered from these Green's
function elements.

Eq. (2.16) can be interpreted in two different ways: (1) as a discrete counterpart of
the Krein integral equation (see [9],[10], or scale i and j by Δ and let Δ → 0); or (2) as
a set of Yule-Walker equations for computing the least-squares autoregressive prediction
filter coefficients of a discrete-time zero-mean stationary random process with unit variance
and covariance lags R(i). The transmission loss t_n through n layers in (2.16) is analogous
to the rms prediction error of an nth-order filter.

Since the covariance lags of a stationary random process must form a positive definite
sequence, the following result, attributed to Kunetz [7], is not surprising. Let R(n) be
the impulse response of a discrete lossless layered medium with a free surface. Then the
two-sided sequence {. . . R(2), R(1), 1, R(1), R(2) . . .} is positive definite. In fact, it is the
autocorrelation of the transmission response T(j) defined in (2.13b). This result is derived
algebraically in [7]. Furthermore, a well-known result in linear prediction theory [16] states
that the reflection coefficients have the property |r(n)| < 1 if and only if the system matrix
(2.16) is positive definite, i.e., the R(j) and their time reversal form a positive definite
sequence.
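The equivalence between positive definiteness of the Toeplitz matrix and reflection coefficients of magnitude less than one can be illustrated with the Levinson-Durbin recursion from linear prediction. The sketch below is a generic textbook recursion, not code from the paper; the sign convention for the coefficients, the function names, and the two example lag sequences are assumptions. The first sequence is the normalized autocorrelation of a short filter and is therefore feasible; the second is a made-up sequence that is not.

```python
import numpy as np

def toeplitz_from_lags(c):
    """Symmetric Toeplitz matrix whose (i, j) entry is c[|i - j|]."""
    c = np.asarray(c, dtype=float)
    n = np.arange(len(c))
    return c[np.abs(np.subtract.outer(n, n))]

def levinson_reflection_coefficients(c):
    """Levinson-Durbin recursion on lags c[0..n]; stops if the error power is not positive."""
    c = np.asarray(c, dtype=float)
    a = np.array([1.0])              # prediction-error filter, a[0] = 1
    err = c[0]                       # prediction error power
    ks = []
    for m in range(1, len(c)):
        if err <= 0.0:
            break
        acc = np.dot(a, c[m:0:-1])   # sum_j a[j] * c[m - j]
        k = -acc / err
        ks.append(k)
        a = np.concatenate((a, [0.0])) + k * np.concatenate(([0.0], a[::-1]))
        err *= 1.0 - k**2
    return np.array(ks)

def is_feasible(c):
    """True when every reflection coefficient exists and has magnitude < 1."""
    ks = levinson_reflection_coefficients(c)
    return len(ks) == len(c) - 1 and bool(np.all(np.abs(ks) < 1.0))

h = np.array([1.0, 0.5, 0.25])                     # hypothetical filter
good = np.correlate(h, h, mode="full")[2:]         # its autocorrelation lags 0..2
good = good / good[0]
bad = np.array([1.0, 0.9, 0.2])                    # not an autocorrelation

min_eig_good = np.linalg.eigvalsh(toeplitz_from_lags(good)).min()
min_eig_bad = np.linalg.eigvalsh(toeplitz_from_lags(bad)).min()
```

For the second sequence the recursion produces a coefficient of magnitude larger than one at the second step, and the smallest eigenvalue of the Toeplitz matrix is negative, so both tests flag the same data as infeasible.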

The significance of this result in applying the layer stripping/Schur algorithm (2.9)
is as follows. If the impulse reflection response data are corrupted by noise so that they
no longer constitute a positive definite sequence, then the algorithm will fail. This is
appropriate, since such data are infeasible in that there is no lossless medium that could
give rise to such data. This is why layer stripping algorithms have the reputation of being
unstable in noise: they are being fed infeasible data! If the noisy data are processed so
that they form a positive definite (although still noisy) sequence, the algorithm will behave
well numerically. Of course, it will reconstruct the medium associated with the noisy data,
not the actual medium, but it will not diverge.
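One simple way to process a sequence so that it becomes positive definite is to clip the negative part of its spectrum. This is offered only as an illustrative assumption: the text does not specify here which processing is used, and the function name, the spectral floor, and the example data below are hypothetical. The lags are extended to an even periodic sequence, whose DFT is real; raising that DFT to a small positive floor yields a positive definite circulant matrix, and the Toeplitz matrix of the corrected lags is a principal submatrix of it, hence also positive definite.

```python
import numpy as np

def project_to_positive_definite(c, floor=1e-6):
    """Return lags (normalized to unit zero lag) with a strictly positive spectrum.

    c holds lags 0..n with n >= 1. Spectral clipping is an assumed, generic
    repair step, not necessarily the procedure used in the paper.
    """
    c = np.asarray(c, dtype=float)
    n = len(c) - 1
    # Even circular extension of length 2n: c0, c1, ..., cn, c(n-1), ..., c1.
    s = np.concatenate((c, c[-2:0:-1]))
    spectrum = np.fft.fft(s).real            # real because s is real and even
    spectrum = np.maximum(spectrum, floor)   # clip negative spectral values
    s_fixed = np.fft.ifft(spectrum).real
    lags = s_fixed[: n + 1]
    return lags / lags[0]

noisy = np.array([1.0, 0.9, 0.2])            # hypothetical infeasible lags
fixed = project_to_positive_definite(noisy)

idx = np.abs(np.subtract.outer(np.arange(3), np.arange(3)))
min_eig_before = np.linalg.eigvalsh(noisy[idx]).min()
min_eig_after = np.linalg.eigvalsh(fixed[idx]).min()
```

The corrected sequence differs from the measured one, so a reconstruction based on it is still that of a perturbed medium; the gain is only that the recursion is now guaranteed reflection coefficients of magnitude less than one.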

In the next section we generalize all of these results to the 2-D case.

3. New 2-D Results for Plane-Wave Excitation.

3.1. Continuous Problem. The 2-D Schrodinger equation inverse scattering
problem is defined as follows. The problem is defined in (x,z) space, where x is lateral position
and z is depth, increasing downward from the surface z = 0. The wave field u(x,z,k)
satisfies the 2-D Schrodinger equation

$$\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial z^2} + k^2 - V(x,z)\right) u(x,z,k) = 0, \qquad (3.1)$$

where the potential V(x,z) is real-valued and smooth. It is also assumed that V(x,z) does
not induce bound states; a sufficient condition for this is for V(x,z) to be non-negative.

In separate experiments, the medium is probed at various angles of incidence θ to
the vertical by an impulsive plane wave δ(t − x sin θ − z cos θ) which at the point x = 0
passes through the surface z = 0 at time t = 0 and induces scattering due to V(x,z) for
t > 0. A Fourier transform taking time t into frequency ω results in $e^{-i\omega(x\sin\theta + z\cos\theta)}$ as the
excitation. Since each experiment is performed separately, we may define the wavenumbers
k = ω cos θ and k_x = ω sin θ; the excitation for each experiment then becomes $e^{-ik_x x}e^{-ikz}$.
Note that for normal incidence θ = 0 this reduces to $e^{-ikz}$ (compare to (2.2)). Note also
that the effective medium wave speed c_0 defined by k = ω/c_0 varies for each experiment,
but is constant for a given experiment.

The data are the reflection responses $R(k,x;k_x)e^{ikz}$ in the direction of decreasing z
resulting from the plane waves $e^{-ik_x x}e^{-ikz}$ (compare to (2.2)). Note this is the
backscattered field for all x in the infinite half space z < 0. The goal is to reconstruct the potential
V(x,z) from the reflection responses R(k,x;k_x). In fact, Section 4 will show that the
reflection response to a single impulsive point source is sufficient, but this more general
formulation is also of interest.

3.2. Applications. We now quickly review two applications of this problem. First,
consider the problem of reconstructing a 3-D inhomogeneous acoustic medium whose
density ρ(x,z) and wave speed c(x,z) are smooth functions of depth z and lateral position x.
The medium is bounded by a free (pressure-release) surface z = 0. The density ρ_0 and
wave speed c_0 for z < 0 and z → ∞ are known. The medium is probed with cylindrical
harmonic waves, at two frequencies ω_1 and ω_2, from a harmonic line source extending
along the x-axis, and the sinusoidal steady-state vertical acceleration a(x,y,z = 0;ω_i) of
the medium at the free surface z = 0 is measured. The goal is to reconstruct ρ(x,z) and
c(x,z) from the measurements a(x,y,z = 0;ω_i), i = 1,2.

This problem can be formulated as a 2-D Schrodinger equation inverse potential
problem by Fourier transforming the basic acoustic equations with respect to time and the other
lateral variable y. Details are given in both [1] and [4]. Here we merely note that in the
Schrodinger equation (3.1) the wave field u(x,z,k) is pressure divided by ρ(x,z)^{1/2}, the
wavenumber $k^2 = \omega_i^2/c_0^2 - [\ldots]$, and the potential V(x,z;ω_i) is

$$V(x,z;\omega_i) = [\ldots] \qquad (3.2)$$

It is clear that performing this experiment for two different frequencies ω_i, i = 1,2 will
allow ρ(x,z) and c(x,z) to be computed from (3.2). The wave field is zero at the free
surface z = 0; its gradient is the medium acceleration ρ(x,0)^{1/2} a(x,y,z = 0;ω_i), i = 1,2.

The second application is the inverse resistivity problem of reconstructing a 3-D
inhomogeneous electrical medium whose resistivity ρ(x,z) is a smooth function of x and z
over a bounded region. The medium is probed with current from a line DC current source
extending along the x-axis, and the electrical potential v(x,y,z = 0) induced on the
surface z = 0, assumed to be a perfect insulator, is measured. The goal is to reconstruct the
resistivity ρ(x,z) from the measurements of electrical potential v(x,y,z = 0). Note that
for both applications, the response to a line source may be found by superposition of the
responses due to point sources along the x-axis.

This problem can be formulated as a 2-D Schrodinger equation inverse potential
problem by Fourier transforming Ohm's and Kirchhoff's current laws with respect to the
other lateral variable y. Details are given in [2]. Here we merely note that in the
Schrodinger equation (3.1) the wave field u(x,z,k) is now the inverse Laplace transform
of the Fourier transform of electrical potential divided by ρ(x,z)^{1/2}, and the scattering
potential is $V(x,z) = \rho(x,z)^{1/2}\,\nabla^2\!\left(\rho(x,z)^{-1/2}\right)$.

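3.3. Reformulation of (3.1) as a Two-Component Wave System. Taking the
Fourier transform of (3.1) in the lateral spatial variable x yields

$$\left(\frac{\partial^2}{\partial z^2} + k^2 - k_x^2\right) u(z,k,k_x) = \int_{-\infty}^{\infty} V(z,k_x - k_x')\,u(z,k,k_x')\,dk_x', \qquad (3.3)$$

where

$$u(z,k,k_x) = \int_{-\infty}^{\infty} u(x,z,k)\,e^{-ik_x x}\,dx; \quad V(z,k_x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} V(x,z)\,e^{-ik_x x}\,dx. \qquad (3.4)$$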


The multiplication by k_x² in (3.3) will induce numerical instability. Hence we replace k_x²
with F(k_x²), where F(k_x²) is a function that windows k_x² to zero for large |k_x|. One possible
choice of F(k_x²) is

$$F(k_x^2) = \begin{cases} k_x^2, & |k_x| < K \\ 0, & \text{otherwise} \end{cases} \qquad (3.5)$$

for some cutoff wavenumber K. This is reminiscent of the clipped filter used in the filtered
back-projection procedure for inverting the Radon transform.

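Replacing k_x² with F(k_x²), (3.3) may be rewritten as

$$\left(\frac{\partial^2}{\partial z^2} + k^2\right) u(z,k,k_x) = \int_{-\infty}^{\infty} \left(F(k_x^2)\,\delta(k_x - k_x') + V(z,k_x - k_x')\right) u(z,k,k_x')\,dk_x'. \qquad (3.6)$$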

To clarify the 2-D form of the Miura transform (2.5), we now discretize k_x into integer
multiples of an arbitrarily small Δ. This results in

$$\left(\frac{d^2}{dz^2} + k^2 - \mathbf{V}\right)\mathbf{u}(z,k) = 0, \qquad (3.7)$$

where u(z,k) is a matrix whose (m,n)th element is u(z,k,mΔ) associated with the
experiment with excitation $e^{-in\Delta x}e^{-ikz}$, and V is a Toeplitz-plus-diagonal matrix with (i,j)th
element

$$\mathbf{V}_{i,j} = F((i\Delta)^2)\,\delta(i-j) + V(z,(i-j)\Delta)\,\Delta. \qquad (3.8)$$

Note that the indices in (3.8) run from −N to N, where N is an arbitrarily large integer,
and that V is Hermitian, since V(x,z) is real.
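The structure of the matrix in (3.8) can be made concrete with a short sketch that assembles it at one depth and checks that it is Hermitian. Several things here are assumptions made for illustration only: the window is taken to be k_x² below the cutoff K and zero above it, the lateral transform of the potential is supplied as a hypothetical closed-form function (a shifted Gaussian, scaled by 1/(2π)), and the values of N, Δ and K are arbitrary.

```python
import numpy as np

def window_kx2(kx, K):
    """Windowed k_x^2: equal to k_x^2 inside the cutoff K, zero outside (assumed form)."""
    kx = np.asarray(kx, dtype=float)
    return np.where(np.abs(kx) < K, kx**2, 0.0)

def potential_matrix(V_hat, N, delta, K):
    """Toeplitz-plus-diagonal matrix with entries
    F((i*delta)^2) * [i == j] + V_hat((i - j) * delta) * delta, for i, j = -N..N."""
    idx = np.arange(-N, N + 1)
    diff = np.subtract.outer(idx, idx)                 # i - j
    toeplitz_part = V_hat(diff * delta) * delta        # convolution with the potential
    diagonal_part = np.diag(window_kx2(idx * delta, K))
    return diagonal_part + toeplitz_part

def V_hat(kx):
    """Hypothetical example: transform of the real potential exp(-(x - 1)^2 / 2),
    including a 1/(2*pi) factor; complex because the potential is not even in x."""
    return np.exp(-kx**2 / 2.0) * np.exp(-1j * kx) / np.sqrt(2.0 * np.pi)

Vm = potential_matrix(V_hat, N=8, delta=0.25, K=1.5)
hermitian_error = np.max(np.abs(Vm - Vm.conj().T))
```

The Hermitian property follows from the conjugate symmetry of the transform of a real potential, so it holds for any real V(x,z) and does not depend on the particular example.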
Now define the matrix reflectivity function r by (compare to (2.5))

$$\mathbf{V} = \mathbf{r}^2 - \frac{d\mathbf{r}}{dz}; \quad \mathbf{u}(z,k) = (\mathbf{D} + \mathbf{U})(z,k), \qquad (3.9)$$

where r(0) is assumed to be known (since z = 0 denotes the surface of the scattering
medium). Then (3.7) can be transformed into the multichannel two-component wave
system (compare to (2.3))

$$\frac{d}{dz}\begin{bmatrix} \mathbf{D}(z,k) \\ \mathbf{U}(z,k) \end{bmatrix} = \begin{bmatrix} -ikI & -\mathbf{r}(z) \\ -\mathbf{r}(z) & ikI \end{bmatrix} \begin{bmatrix} \mathbf{D}(z,k) \\ \mathbf{U}(z,k) \end{bmatrix} \qquad (3.10)$$

in which D(z,k), U(z,k), r(z), and R(k) (see (3.11)) are all N × N matrices. For each of
these matrices, different columns correspond to different experiments, while different rows
correspond to different channels for a given experiment.

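The boundary conditions for (3.10) are (compare to (2.4))

$$\mathbf{D}(z,k) = Ie^{-ikz}; \quad \mathbf{U}(z,k) = \mathbf{R}(k)e^{ikz}, \quad z < 0 \qquad (3.11a)$$

$$\mathbf{D}(z,k) = \mathbf{T}(k)e^{-ikz}; \quad \mathbf{U}(z,k) = 0, \quad z \to \infty \qquad (3.11b)$$

$$[\mathbf{R}(k)]_{m,n} = \int_{-\infty}^{\infty} R(k,x;n\Delta)\,e^{-im\Delta x}\,\Delta\,dx \qquad (3.11c)$$

where $R(k,x;k_x)e^{ikz}$ is the reflection response to $e^{-ik_x x}e^{-ikz}$. The Fourier transform of
$e^{-ik_x x}$ in x is an impulse in lateral wavenumber, so each excitation $e^{-ik_x x}e^{-ikz}$ for (3.1)
excites a single channel in (3.10). So the (m,n)th element of D(z,k) or U(z,k) is the downgoing or
upgoing wave at depth z in the mth channel, resulting from a downgoing impulse $e^{-ikz}$ in
the nth channel only.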

Hence the 2-D Schrodinger equation inverse scattering problem (3.1) can be
transformed into the multichannel two-component wave system inverse scattering problem
(3.10)-(3.11), by way of the multichannel Schrodinger equation (3.7). Once r(z) has been
reconstructed, the Schrodinger equation inverse scattering problem (3.1) is solved by
computing V(z) from r(z) using (3.9). Note the differential equation (3.9) need not be solved.

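3.4. Discretization. The multichannel two-component wave system (3.10) has the
same form as the scalar two-component wave system (2.3). The development in Section
2.2 generalizes directly to the multichannel case; the layer stripping algorithm for the
multichannel two-component wave system (3.10), and hence for the Schrodinger equation
(3.1), is (compare to (2.9))

$$\begin{bmatrix} \mathbf{D}(z+\Delta,t) \\ \mathbf{U}(z+\Delta,t) \end{bmatrix} = \begin{bmatrix} I & -\mathbf{r}(z)\Delta \\ -\mathbf{r}(z)\Delta & I \end{bmatrix} \begin{bmatrix} \mathbf{D}(z,t-\Delta) \\ \mathbf{U}(z,t+\Delta) \end{bmatrix} \qquad (3.12a)$$

$$\mathbf{r}(z)\Delta = \mathbf{U}(z,z)\,\mathbf{D}^{-1}(z,z-2\Delta) \qquad (3.12b)$$

$$\mathbf{D}(0,t) = I; \quad \mathbf{U}(0,t) = \int_{-\infty}^{\infty} \mathbf{R}(k)e^{ikt}\,dk \qquad (3.12c)$$

where Δ is discretization length for depth z and time t, as in (2.9).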

It is worth examining the above discretizations from a digital signal processing
perspective. We now recall some points made in [5] about actual numerical implementation
of discretized equations such as (3.12). Let Δ_z be the discretization length for z and t in
(3.12), and let Δ_k be the discretization length for k_x in (3.7). Note that Δ_z and Δ_k have
reciprocal units. First, note that Δ_k should be much less than half the reciprocal of the
total lateral extent of interest, to avoid aliasing. For example, if the potential has finite
support −L_x/2 < x < L_x/2 in x, L_x would be the lateral extent of interest.

Second, note that the discretized functions D(z + Δ,t), etc. cannot be regarded as
sampled versions of the continuous functions D(z,t), etc., even if sampling is performed
above the Nyquist rate, since the convolution in (3.6) (which becomes the products of
Toeplitz matrices with columns of other matrices in (3.7)) mixes the k_x. Even if the
inverse potential problem is regularized by assuming that V(z,k_x) is bandlimited in z and
zero for |k_x| > K for some K, it is clear that u(z,k,k_x), etc. will NOT have similar
properties. Imposing a bandlimited in k_x condition at each recursion will lead to errors,
since the missing high k_x will cause errors for low k_x due to the mixing of the k_x.

This leads to the question of what the discretized functions mean, and how the
convolutions in k_x should be performed. It should be noted that similar questions arise in
integral equation methods. One possible interpretation is to perform a periodic extension
in k_x of all quantities. The period in k_x should be 1/Δ; K in (3.5) should then be half this.
It is clear by induction that if all quantities at depth z = nΔ are periodic in k_x, then all
quantities at depth z = (n + 1)Δ will also be periodic in k_x.

This creates two advantages: (1) the infinite linear convolutions become finite cyclic
convolutions; and (2) the discrete Fourier transform may be used to perform all Fourier
transforms. In terms of (3.7), V(z) is now a circulant matrix. Since periodicity in one
Fourier domain is equivalent to discreteness in the other Fourier domain, the problem
has effectively been discretized laterally as well as vertically: the quantities propagated in
(2.11) are not samples of a bandlimited function, but actual discrete values. As Δ → 0,
the situation approaches the continuous problem.
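The two advantages named above can be seen in a few lines: multiplying by a circulant matrix is a cyclic convolution, and the discrete Fourier transform evaluates that convolution. The sketch below uses random test vectors and hypothetical function names; it is a generic illustration of the identity rather than part of the reconstruction algorithm itself.

```python
import numpy as np

def circulant(c):
    """Circulant matrix whose (i, j) entry is c[(i - j) mod n]."""
    c = np.asarray(c)
    n = len(c)
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    return c[idx]

def cyclic_convolve_fft(c, u):
    """Cyclic convolution of c and u computed with the discrete Fourier transform."""
    return np.fft.ifft(np.fft.fft(c) * np.fft.fft(u))

rng = np.random.default_rng(0)
v = rng.standard_normal(16)      # one period of a lateral-wavenumber kernel (test data)
u = rng.standard_normal(16)      # one column of a wave matrix (test data)

direct = circulant(v) @ u                    # matrix-vector product, O(n^2)
via_fft = cyclic_convolve_fft(v, u).real     # same result, O(n log n)
max_diff = np.max(np.abs(direct - via_fft))
```

For the matrix sizes implied by a fine lateral discretization, the DFT route replaces each dense matrix-vector product with a handful of fast transforms.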

3.5. Discrete Problem. Another approach is to formulate and solve an explicitly
discrete 2-D Schrodinger equation inverse scattering problem. Here we present for the first
time such a problem, generalizing the results reviewed in Section 2.3.

We define the discrete 2-D Schrodinger equation inverse scattering problem as follows.
A wave field u(i,j,n), where i is depth, j is lateral position, and n is time, satisfies the
2-D discrete Schrodinger equation

$$\begin{aligned} u(i+1,j,n) + u(i-1,j,n) + u(i-1,j+1,n) + u(i-1,j-1,n) - 2u(i-1,j,n) & \\ {} - u(i,j,n+1) - u(i,j,n-1) &= V(i-1,j)\,u(i-1,j,n). \end{aligned} \qquad (3.13)$$

It is clear that if the indices in (3.13) are scaled to be multiples of Δ then (3.13) becomes
the time-domain form of the continuous Schrodinger equation (3.1) as Δ → 0. Note
that the differences corresponding to ∂²/∂x² are shifted in depth z from the differences
corresponding to ∂²/∂z² and ∂²/∂t². This is necessary to obtain the multichannel
two-component wave system below.

A discrete-time Fourier transform of (3.13) taking j into k_x gives (compare to (3.3))

$$\begin{aligned} u(i+1,n,k_x) + u(i-1,n,k_x) - u(i,n+1,k_x) - u(i,n-1,k_x) & \\ = (2 - e^{ik_x} - e^{-ik_x})\,u(i-1,n,k_x) + \sum_{m=-\infty}^{\infty} V(i-1,k_x-m)\,u(i-1,n,m) & \end{aligned} \qquad (3.14a)$$

$$u(i,n,k_x) = \sum_{j=-\infty}^{\infty} u(i,j,n)\,e^{-ik_x j}; \quad V(i,k_x) = \sum_{j=-\infty}^{\infty} V(i,j)\,e^{-ik_x j}. \qquad (3.14b)$$

The definitions (3.14b) should be compared with (3.4). Note that

$$2 - e^{ik_x} - e^{-ik_x} = 2(1 - \cos k_x) \approx k_x^2, \quad k_x \to 0 \qquad (3.15)$$
which also shows how the discrete 2-D Schrodinger equation (3.13) reduces to the
continuous 2-D Schrodinger equation (3.1). Also note that 2(1 − cos k_x) can be regarded as a
choice of F(k_x²) different from (3.5).

The boundary conditions for (3.14) are (compare to (3.11))

$$u(i,n,k_x) = \begin{cases} \delta(n-i)\,\delta(k_x - k_x') + R(n+i,k_x;k_x'), & \text{if } i \le 0; \\ T(n-i,k_x;k_x'), & \text{if } i \to \infty. \end{cases} \qquad (3.16)$$

In (3.16) R(n,k_x;k_x') is the reflection response to an impulsive plane wave that excites
the lateral wavenumber k_x' only. This could be measured for an actual medium as
discussed earlier. The discrete 2-D inverse scattering problem is to reconstruct V(i,k_x) from
R(n,k_x;k_x').

3.6. Reformulation of (3.14) as a Two-Component Wave System. As in
Section 3.3, we discretize k_x into integer multiples of Δ. Eq. (3.14) can then be rewritten
as (compare to (3.7))

\[
\mathbf{u}(i+1,n) + \mathbf{u}(i-1,n) - \mathbf{u}(i,n+1) - \mathbf{u}(i,n-1)
= \mathbf{V}(i-1)\,\mathbf{u}(i-1,n) \tag{3.17}
\]

where u(i,n) is a matrix whose (j,k)th element is u(i,n,jΔ;kΔ) and V(i) is a
Toeplitz-plus-diagonal matrix with (j,k)th element (using (3.15); compare to (3.8))

\[
\mathbf{V}(i)(j,k) = 2(1 - \cos(j\Delta))\,\delta(j-k) + V(i,(j-k)\Delta). \tag{3.18}
\]

Again the indices in (3.18) run from −N to N, where N is an arbitrarily large integer, and
V(i) is Hermitian, since V(i,j) is real.
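The structure of (3.18) can be illustrated with a minimal sketch. It assumes the lateral profile V(i,j) at one depth is given as a short real vector on integer positions, and that V(i,k_x) is its discrete-time Fourier transform with an exp(−ijk_x) kernel; the function name `potential_matrix` and its arguments are hypothetical.

```python
import numpy as np

def potential_matrix(v_lateral, positions, N, delta):
    """Assemble the (2N+1) x (2N+1) matrix of (3.18) at a single depth.

    v_lateral : real samples V(i, j) of the potential at this depth
    positions : the integer lateral positions j of those samples
    delta     : wavenumber discretization step (Delta in the text)
    """
    v_lateral = np.asarray(v_lateral, dtype=float)
    positions = np.asarray(positions, dtype=float)
    idx = np.arange(-N, N + 1)
    diff = idx[:, None] - idx[None, :]                 # (j - k) for every entry
    # Fourier transform of the real profile at wavenumber (j - k) * delta
    phase = np.exp(-1j * delta * diff[..., None] * positions)
    toeplitz_part = phase @ v_lateral
    diagonal_part = np.diag(2.0 * (1.0 - np.cos(idx * delta)))
    return diagonal_part + toeplitz_part
```

Because the profile is real, the Toeplitz part satisfies V(i,−m) = conj(V(i,m)), so the assembled matrix is Hermitian, in line with the remark following (3.18).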

The matrix discrete Schrodinger equation (3.17) represents the discrete 2-D Schrodinger
equation (3.13), just as the matrix continuous Schrodinger equation (3.7) represents the
continuous 2-D Schrodinger equation (3.1). The discrete Miura transform (2.14)
generalizes to the multichannel case as follows [17]. Given the matrix potential V(i) defined in
(3.18), define the matrix reflection coefficient r(i) as the solution to

\[
\mathbf{V}(i) = I - \mathbf{T}(i)\,(I + \mathbf{r}(i))\,(I - \mathbf{r}(i-1))\,\mathbf{T}^{-1}(i) \tag{3.19a}
\]

\[
\mathbf{T}(i) = \prod_{j=1}^{i} [\,\text{factor illegible in source}\,] \tag{3.19b}
\]

\[
\mathbf{u}(i,n) = \mathbf{T}(i)\,(I - \mathbf{r}(i-1))^{-1}\,(\mathbf{D}(i,n) + \mathbf{U}(i,n)). \tag{3.19c}
\]

Then the matrix discrete Schrodinger equation (3.17) is equivalent to the discrete
multichannel two-component wave system (compare to (2.12), (3.10))

\[
\begin{bmatrix} \mathbf{D}(i+1,j) \\ \mathbf{U}(i+1,j) \end{bmatrix}
=
\begin{bmatrix} (I - \mathbf{r}(i)^2)^{-1/2} & 0 \\ 0 & (I - \mathbf{r}(i)^2)^{-1/2} \end{bmatrix}
\begin{bmatrix} I & -\mathbf{r}(i) \\ -\mathbf{r}(i) & I \end{bmatrix}
\begin{bmatrix} \mathbf{D}(i,j-1) \\ \mathbf{U}(i,j+1) \end{bmatrix}. \tag{3.20}
\]

As in the continuous system (3.10), the bold quantities are all N × N matrices. For each
of these matrices, different columns correspond to different experiments, while different
rows correspond to different channels for a given experiment. The inverse scattering
problem defined by (3.14)-(3.16) is equivalent to the inverse scattering problem defined by
(3.20) and (3.16).

Several  comments  are  in  order  here: 

1. In the scalar case (3.19) reduces to (2.14) and (3.20) reduces to (2.12), as expected.
However, the generalization of (2.14) to (3.19) is not at all obvious, due to the matrix
transmission factor T(i);

2. The transmission factor T(i) becomes a scalar factor in (2.12) which cancels out in
the layer stripping/Schur algorithm in (2.9b). This is why the 1-D layer stripping
algorithm is identical to the Schur algorithm: the only difference does not make a
difference;

3. However, the transmission factor T(i) does make a difference in the multichannel
problem. It appears both in (3.19) and as the matrix factor (I − r(i)²)^{-1/2} in (3.20).
It does not appear in the 2-D layer stripping algorithm (3.12), or in the algorithms
of [4] and [5]. The reason is that since r(i) is scaled by Δ in algorithms obtained by
discretizing continuous equations, (I − r(i)²)^{-1/2} differs from I only by a term of second
order in Δ, so that it will not be accounted for in first order discretizations.
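The second-order claim in comment 3 can be checked numerically. The sketch below assumes r is Hermitian with all singular values below one and forms (I − r²)^{-1/2} through an eigendecomposition; the helper name `transmission_factor` is hypothetical and is not taken from [17] or [18].

```python
import numpy as np

def transmission_factor(r):
    """Return (I - r^2)^(-1/2) for a Hermitian matrix r with singular values < 1."""
    r = np.asarray(r)
    eigvals, eigvecs = np.linalg.eigh(r)       # real eigenvalues, unitary eigenvectors
    if np.max(np.abs(eigvals)) >= 1.0:
        raise ValueError("r must have all singular values below one")
    scale = 1.0 / np.sqrt(1.0 - eigvals**2)
    return (eigvecs * scale) @ eigvecs.conj().T

# Example: a Hermitian r of spectral norm delta = 0.01.
rng = np.random.default_rng(0)
a = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
h = (a + a.conj().T) / 2
r_small = 0.01 * h / np.linalg.norm(h, 2)
deviation = np.linalg.norm(transmission_factor(r_small) - np.eye(4), 2)
print(deviation)  # of order delta**2 / 2
```

For a Hermitian matrix the singular values are the absolute values of the eigenvalues, which is why a single eigendecomposition suffices here.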

3.7. Solution by Schur Algorithm and Block-Toeplitz Systems of Equations.
The 2-D discrete inverse scattering problem defined by (3.14)-(3.16), or equivalently (3.20),
can be solved very easily using the multichannel form of the Schur algorithm [18]. This
algorithm has the same form as (3.12), except that (3.12a) should be replaced by (3.20) in
order to incorporate the factor (I − r(i)²)^{-1/2}, which produces additional coupling between
channels. Unlike the scalar case, in which the Schur algorithm was identical to the layer
stripping algorithm (2.9), the transmission factor does not cancel out, since it cannot be
pulled out as an overall factor multiplying everything (matrices do not commute). As
before, the independent variables must be scaled by Δ.

As before, the multichannel Schur algorithm should be expected to work well, since it
solves exactly an explicitly discrete 2-D inverse scattering problem, instead of being merely
a discretization of continuous equations. The discrete matrix two-component system (3.20)
is lossless as long as the maximum singular value of r(i) is less than unity. The necessity of
this should be clear from the factor (I − r(i)²)^{-1/2}; sufficiency can be established using a
scattering argument, as before. This can often be established physically for media described
by scattering systems of the form (3.20). For example, P and SV wave propagation in elastic
media can be put into the form (3.20), where the matrices are all 2 × 2 [8]; the matrix of
P-P, P-SV, and SV-SV interface reflection coefficients can be shown to have singular values
less than unity (see the Appendix of [8]). This can be viewed as a matrix generalization of
the scalar acoustic medium result r(i) = (Z(i+1) − Z(i))/(Z(i+1) + Z(i)) → |r(i)| < 1.
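The scalar acoustic result quoted above is easy to verify for any positive impedance profile. A minimal sketch, assuming the impedances Z(i) are supplied as a positive array (the function name is illustrative only):

```python
import numpy as np

def interface_reflection_coefficients(impedance):
    """r(i) = (Z(i+1) - Z(i)) / (Z(i+1) + Z(i)) for a positive impedance profile."""
    z = np.asarray(impedance, dtype=float)
    if np.any(z <= 0):
        raise ValueError("impedances must be positive")
    return (z[1:] - z[:-1]) / (z[1:] + z[:-1])

r_scalar = interface_reflection_coefficients([1.0, 2.0, 4.0, 3.0])
print(r_scalar)  # every entry lies strictly between -1 and 1
```

Since both impedances are positive, the numerator is always smaller in magnitude than the denominator, which is the scalar counterpart of the singular-value bound on r(i).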

Now suppose, as in the 1-D case, that the top interface is a free surface (perfect
reflector), and that the probing impulse in (3.16) is introduced just below this surface.
The Schur algorithm computes the reflection coefficients associated with the block-Toeplitz
system of equations [18] (compare to (2.16))

\[
\begin{bmatrix}
I & \mathbf{R}(1) & \cdots & \mathbf{R}(n) \\
\mathbf{R}(1) & I & \cdots & \mathbf{R}(n-1) \\
\vdots & \vdots & \ddots & \vdots \\
\mathbf{R}(n) & \mathbf{R}(n-1) & \cdots & I
\end{bmatrix}
\begin{bmatrix} \mathbf{F}_0^n - \mathbf{G}_0^n \\ \vdots \\ \mathbf{F}_n^n - \mathbf{G}_n^n \end{bmatrix}
=
\begin{bmatrix} 0 \\ \vdots \\ 0 \\ \mathbf{T}(n) \end{bmatrix} \tag{3.21a}
\]

\[
\mathbf{R}(m)_{i,j} = R(m,i\Delta;j\Delta). \tag{3.21b}
\]

Note that the R(m) are themselves matrices defined from the reflection response R(m,k_x;k_x')
defined in (3.16). F_j^n and G_j^n are elements of the matrix Green's function of (3.20).
Reflection coefficients r(n) can be recovered using r(n) = G_0^n − F_0^n. All of these are
generalizations of results in [10]; they appeared in [8] for the case of 2 × 2 matrices, which
generalize directly to the N × N case.

In (3.21), scale all indices by Δ and let Δ → 0. The result is a 2-D generalization
of the Krein integral equation. Solution of this integral equation will solve the original
continuous Schrodinger equation (3.1) inverse scattering problem, since the effects of all
index shifts and transmission loss disappear in the continuous limit Δ → 0. Although we
do not pursue this further here, this integral equation, which reconstructs a 2-D potential
from backscattering data over a plane, seems to be new.

3.8. 2-D Generalization of the Kunetz Condition. Alternatively, (3.21) can
be regarded [16] as a set of multichannel Yule-Walker equations for computing the
least-squares multichannel autoregressive prediction filter coefficients for a discrete-time
multichannel zero-mean stationary random process having matrix covariance lags R(i). The
covariance matrix of the multichannel process at a given time is normalized to I. The
transmission loss factor T(n) in (3.21) is analogous to the error covariance matrix of an
nth-order filter. This leads directly to the next result.

The matrix covariance lags of a multichannel stationary random process must form a
positive definite sequence, i.e., the system matrix in (3.21a) must be positive definite. Since
solution of the inverse scattering problem defined by (3.14)-(3.16) (or (3.20) and (3.16))
is equivalent to the solution of the block Toeplitz system (3.21a), we have the following
result: Let R(n,k_x;k_x') be the discrete-time Fourier transform (taking lateral position j
into k_x) of the free-surface impulse response to excitation δ(n)δ(k_x − k_x') (see (3.16)) of a
2-D discrete scattering medium described by the 2-D discrete Schrodinger equation (3.13).
Then R(n,k_x;k_x') must be a positive definite matrix function, i.e., the system matrix in
(3.21a) must be positive definite.

This result can be viewed as a 2-D generalization of the 1-D result of Kunetz [7]. The
result of [7] was extended to the two-channel case (elastic medium) in [8]; the proof in
[8], which follows closely the proof of the scalar result in [7], easily extends to the case of
an arbitrary number of channels, and will not be repeated here. As in the 1-D case, the
free-surface reflection response is in fact the autocorrelation of the transmission response
T(n,k_x;k_x'). Furthermore, a result from multichannel linear prediction theory [16] states
that the system matrix in (3.21a) is positive definite if and only if the reflection coefficients
r(i) have maximum singular values less than one.

This result carries the same significance as does the 1-D result. If this property does
not hold, then the data are infeasible, in that no 2-D discrete medium described by (3.13)
could have produced them. Then some r(i) will have a singular value greater than one, and the
multichannel Schur algorithm will fail. If the noisy data are processed so that they are positive
definite, then the multichannel Schur algorithm will reconstruct the medium associated
with the noisy data, not the actual medium, but it will not diverge.
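The feasibility condition can be tested directly by assembling the system matrix of (3.21a) and inspecting its smallest eigenvalue. The sketch below assumes the lags are supplied as a list with R(0) = I first and that every block is Hermitian (as argued in Section 3.9); the names `block_toeplitz` and `is_feasible` are hypothetical.

```python
import numpy as np

def block_toeplitz(lags):
    """Assemble the block-Toeplitz matrix whose (p, q) block is lags[|p - q|].

    lags[0] is taken to be the identity; Hermitian blocks are assumed, so the
    assembled matrix is itself Hermitian.
    """
    n = len(lags) - 1
    rows = [[np.asarray(lags[abs(p - q)]) for q in range(n + 1)]
            for p in range(n + 1)]
    return np.block(rows)

def is_feasible(lags, tol=0.0):
    """True if the system matrix is positive definite (smallest eigenvalue > tol)."""
    system = block_toeplitz(lags)
    system = (system + system.conj().T) / 2   # remove round-off asymmetry
    return bool(np.linalg.eigvalsh(system).min() > tol)
```

For example, with 2 × 2 blocks, `is_feasible([I, 0.3 * I])` holds, while `is_feasible([I, 1.2 * I])` fails because the matrix then has eigenvalue −0.2; data of the second kind could not have come from any medium described by (3.13).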

3.9. Structure of Matrices r(i) and V(i). In this section we discuss the structure
of r(i) and V(i), with regard to Hermitian symmetry and diagonal+Toeplitz or
diagonal+circulant structure.

First, note the above results depend on the Hermitian symmetry of r(i), i.e., r(i) =
r(i)^H. By reciprocity, we have that R(n,k_x;k_x') = R(−n,−k_x';−k_x), since the lateral
wavenumber k_x specifies the direction of the plane wave, and reversing direction
requires changing the sign of both k (hence the sign change of n) and k_x. Then each
block in (3.21a) R(m)_{i,j} = R(m,iΔ;jΔ) (see (3.21b)) is Hermitian, since the sign change
of n produces complex conjugation and the point (−j,−i) is the transpose (about the line
i = −j) of the point (i,j).

Alternatively, the coupled wave system (3.20), to be physically meaningful, must have
the coupling between the ith and jth channels be the complex conjugate of the coupling
between the jth and ith channels. This is a direct statement of Hermitian symmetry of
reflection coefficients r(i).

For the continuous problem, the Hermitian symmetry of r(z) immediately implies,
using (3.9), that the potential V(z) is also Hermitian (note that if r is Hermitian, then
r² = rr^H is also). This is less apparent for the discrete problem; however, from (3.18) it
is clear that V(i) is Hermitian, since V(i,j) is real.

Second, a significant feature of V(i) for the discrete problem (see (3.18)) and V(z) for
the continuous problem (see (3.8)) is that they are diagonal+Toeplitz or diagonal+circulant
(depending on the discretization used). Since they are also Hermitian, this means they
have only N degrees of freedom, not the N² degrees of freedom an arbitrary matrix would
have (recall that the diagonal part is known). We have not yet exploited this structure.

Since r(z) is defined from V(z) using (3.9), and r(i) is defined from V(i) using (3.19a),
it is clear that r(z) and r(i) also have only N degrees of freedom. This implies that (3.12b)
in the 2-D layer stripping algorithm or multichannel Schur algorithm may be replaced by
solution of rD = U, where D and U are vectors, instead of matrices (since this constitutes
N equations in N unknowns). The significance of this observation is that since different
columns of D and U correspond to different experiments, i.e., different excitations, probing
at normal incidence only is sufficient to reconstruct the medium.

This agrees with the requirements of the algorithm of [4] and [5], in which an
asymmetric multichannel two-component wave system was used. The wave system in [4] and [5]
had V(z) itself as one of the scatterers; since V(z) was circulant by the discretization used
therein, VD = U could be solved using the discrete Fourier transform. Unfortunately,
the reflection coefficients in the symmetric two-component wave system used in this paper
do not exhibit their structure in such an obvious manner. But normal incidence alone is
sufficient in principle. This leads to the following result.


4.  New  2-D  Results  for  Point-Source  Excitation. 

4.1. Problem Formulation. The problem considered in this section is as follows.
The medium is again described by the Schrodinger equation (3.1), with the same
assumptions on the scattering potential. However, we now assume a free (perfectly reflecting)
surface, and as excitation the 2-D impulsive point source (1(·) is the unit step)

[Equation (4.1) is garbled in the source. Recoverable parts: the left side contains the
unit step 1(t − √(x² + z²)); the right side is a triple integral over plane waves
e^{ik_x x} e^{ik_z z} e^{ikt} with a prefactor involving 2π.]    (4.1)

The  following  comments  are  appropriate  here: 

1. Note that k_z is vertical wavenumber;

2. Note that the excitation (4.1) is actually a line source in 3-D, but since this paper
deals exclusively with 1-D and 2-D, we call it a point source;

3. Note that (4.1) is the impulse response (Green's function) for a 2-D homogeneous
medium. That is, it is the inverse temporal (k → t) Fourier transform of the solution
to the Schrodinger equation (3.1) if V(x,z) = 0;

4. Note that (4.1) expresses the wave field in the time domain as a superposition of plane
wave basis functions e^{ik_x x} e^{ik_z z} e^{ikt}.


Taking Fourier transforms in the lateral variable (x → k_x) as in Section 3, the downgoing
part of the excitation (4.1) at wavenumber k_x can be written as [expression illegible in
source], which can also be viewed as a downgoing solution to (3.3) when the potential
V(z,k_x) = 0. The upgoing reflection response to the excitation (4.1) can then be written as
R(k_z,k_x) times [factor illegible in source]. That is, the entire wave field u(x,z,k) in the
Schrodinger equation (3.1) at the free surface z = 0 can then be written as the superposition

[Equation (4.2) is garbled in the source. Recoverable parts: u(x,z,k) is an integral over
k_x, weighted by e^{ik_x x}, of incident, surface-reflected, and scattered plane-wave terms,
the last of which carries R(k_z,k_x).]    (4.2)

of a downgoing incident field due to the source and reflection off of the free surface, and an
upgoing scattered field measured at the surface z = 0. The goal is to reconstruct potential
V(x,z) from reflection response R(k_z,k_x).

4.2. New Results. Formally changing variables from k to k_z, (3.3) becomes

\[
\left(\frac{d^2}{dz^2} + k_z^2\right) u(z,k_z,k_x)
= \int_{-\infty}^{\infty} V(z,k_x-k_x')\,u(z,k_z,k_x')\,dk_x', \tag{4.3}
\]

where u(z,k_z,k_x) = u(z,k = √(k_z² + k_x²),k_x) in (3.4) (we do not bother to introduce new
notation for u(z,k_z,k_x)).

From this point forward, we may proceed exactly as in Section 3, with (4.3) replacing
(3.3). There is one major change: The −k_x² term in (3.3) has disappeared, absorbed into
k_z² in (4.3). The effect of this change is that F(k_x²) defined in (3.5) and used subsequently
in (3.6) and (3.7) is now zero. This means that V(z) defined in (3.8) no longer has
diagonal+Toeplitz or diagonal+circulant structure (depending on discretization; see Section
3.4), but has purely Toeplitz or circulant structure.

This produces an important simplification in the results of Section 3. Suppose that the
discretization used produces circulant structure in V(z). Then the reflection coefficients
r(z) defined in (3.9) will also be circulant. This implies that the discretized reflection
response matrices R(k_z)(i,j) = R(k_z,(i − j)Δ) are circulant, since the R(k_z) can
be generated from the r by running the 2-D layer stripping algorithm (3.12) backwards.
Finally, this shows that the block-Toeplitz system of equations (3.21) is now block-Toeplitz
with circulant blocks, so it can be assembled from R(k_z,k_x), even though these have fewer
degrees of freedom than the R(k,k_x;k_x') used before.
The 2-D layer stripping algorithm can then be used to reconstruct the circulant
reflection coefficient matrices from the circulant reflection response matrices. This is an
improvement over the result in Section 3, since fewer data are needed to reconstruct the
potential. The reason for this is that the structure noted in Section 3.9 is brought out
by taking the superposition of plane-wave impulse reflection responses (4.2), rather than
considering each response separately. The simplification appears in the multichannel Schur
algorithm as the simplified form of this algorithm as applied to matrices with
Toeplitz-block-Toeplitz, rather than merely block-Toeplitz, structure.

Now suppose the equations derived for the discrete inverse scattering problem are
used. Note that the more complicated equations (3.19) relating the discrete potential
to the reflection coefficients still preserve the circulant structure, as circulant matrices
are closed under addition, multiplication, and inversion. This also leads to the following
second 2-D generalization of the Kunetz result. Let R(k_z,k_x) be the Fourier transform
of the free-surface response of a 2-D scattering medium described by the 2-D Schrodinger
equation (3.1), to a point-source excitation, as specified in (4.1)-(4.2). Then R(k_z,k_x)
must be a 2-D positive definite function, i.e., the system matrix in (3.21a) must be positive
definite. Note that this differs from the result of Section 3, which specified a multichannel
(vector) positive definite function; now a truly 2-D positive definite function is specified.
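The closure properties of circulant matrices invoked above, and the discrete Fourier transform solution of a circulant system mentioned in Section 3.9, can be illustrated with a short sketch. The helper names (`circulant`, `is_circulant`, `solve_circulant`) and the sample entries are illustrative assumptions, not quantities from the paper.

```python
import numpy as np

def circulant(first_column):
    """Circulant matrix C with C[j, k] = c[(j - k) mod n]."""
    c = np.asarray(first_column)
    n = len(c)
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    return c[idx]

def is_circulant(matrix, tol=1e-10):
    """True if the matrix equals the circulant built from its first column."""
    m = np.asarray(matrix)
    return bool(np.allclose(m, circulant(m[:, 0]), atol=tol))

def solve_circulant(first_column, rhs):
    """Solve C x = rhs using the DFT, which diagonalizes every circulant C."""
    return np.fft.ifft(np.fft.fft(rhs) / np.fft.fft(first_column))

A = circulant([4.0, 1.0, 0.0, 1.0])
B = circulant([2.0, 0.0, 1.0, 0.0])
print(is_circulant(A + B), is_circulant(A @ B), is_circulant(np.linalg.inv(A)))
```

The DFT solve costs O(n log n) rather than O(n³), which is the practical payoff of the circulant structure; it requires that no DFT coefficient of the first column vanish, i.e., that the matrix is invertible.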

5. Numerical Examples. In this section we present some illustrative numerical
examples. First, we demonstrate the algorithms, and show that they provide results
superior to results using the Born approximation. Next, we add noise to the data to render
them infeasible, and show that the algorithms fail, as expected. Finally, we project the noisy
data onto the space of positive definite discrete functions, and show that the algorithms
no longer diverge.

6.  Conclusion. 
APPENDIX C

A.E. Yagle, “Multiresolution Algorithms for Solving One-Dimensional Inverse
Scattering Problems Using the Wavelet Transform,” revision submitted
to IEEE Trans. Sig. Proc., October 1993.

This  paper  applies  the  wavelet  transform  to  the  1-D  inverse  scattering  problem.  We 
derive  a  layer  stripping  algorithm  that  uses  as  input  the  wavelet  transform  of  the  impulse 
reflection  response  of  the  medium.  The  wavelet  transform  decouples  the  layer  stripping 
algorithm into a set of wave systems at differing wave speeds. Any of these
multiple-resolution systems could be used to reconstruct the medium; this allows some flexibility
on how the data are used.

We  also  derive  both  a  layer  stripping  algorithm,  and  a  linear  system  of  equations,  in 
the  2-D  (time  and  space)  wavelet  transform  domain,  from  the  layer  stripping  algorithm  and 
Krein  integral  equation,  respectively.  These  results  show  how  data  at  one  resolution  affects 
the reconstruction at another resolution. They are interpreted in terms of fast algorithms
for  the  slanted  Toeplitz  structured  linear  system  of  equations.  The  Born  approximation  is 
also  derived  and  discussed. 
Abstract

The wavelet transform is applied to the inverse scattering problem of reconstructing
the reflectivity function of a two-component wave system from its impulse reflection
response, a problem which has applications in acoustics and some synthesis problems. Three
new multiresolution algorithms are obtained. First, a single wavelet transform in time
results in a set of independent wave systems in which waves propagate at a speed determined
by the dilation factor. Second, wavelet transforms in both time and space result in a set of
coupled wave systems; in each system the waves propagate at a speed determined by the
two dilation factors. Finally, wavelet transforms in both time and space are applied to the
Krein integral equation, resulting in a block-slanted-Toeplitz linear system of equations,
to which the coupled wave system is related. The latter two results relate the wavelet
transform of the reflectivity function to the wavelet transform of the impulse reflection
response.
I. INTRODUCTION


The mathematical inverse problem of reconstructing a one-dimensional continuous
layered medium from its impulse reflection response has many applications in many
different fields. These include reflection seismology [1], acoustic measurement of the shape of the
human vocal tract [2], and the synthesis of nonuniform transmission lines [3]. All of these
problems can be formulated as a nonlinear problem of reconstructing a spatially-varying
reflectivity function, in the two-component wave system (1) below, from a temporally-varying
impulse reflection response function. Note the problem is nonlinear due to multiple
scattering in the wave system; these effects are included throughout this paper, unlike some
methods which ignore multiple scattering (the Born approximation).

There are two basic approaches to solving this inverse problem. The first approach
is to solve the Krein integral equation [4], which has the impulse reflection response as its
kernel, and read off the reflectivity function from the solution. The second approach is to
use a layer-stripping algorithm [5],[6], which operates recursively in space by differentially
reconstructing the reflectivity function, and then differentially propagating the waves in
(1). Layer stripping algorithms are computationally more efficient than solving integral
equations. The relation between integral equations and layer stripping algorithms is a
continuous-parameter analog of the relation between Toeplitz systems of equations and
the Levinson and Schur fast algorithms for solving such systems of equations; see [6] for a
detailed treatment.

The wavelet transform has recently received much attention from the signal processing
community [7],[8]. The wavelet transform is a multiresolution decomposition of a function
in terms of scalings (contractions and dilations) and translations of a wavelet basis
function, which is localized in time and frequency. It thus generates a time-frequency (actually
time-scale) representation of the function. The wavelet transform can be related to
multirate filtering, subband coding, and quadrature-mirror filtering. Two important papers on
wavelets are [9] and [10]; the list of references in [7] gives some idea of the enormous amount
of interest and applications the wavelet transform has recently generated. No attempt will
be made here to summarize all of the recent work on wavelets.


This paper applies the wavelet transform to the inverse problem of reconstructing
the reflectivity function of a two-component wave system (1) from its impulse reflection
response. Since this inverse problem is nonlinear, the resulting multiresolution algorithms
are complicated; however, they have interesting interpretations, both physical and in terms
of fast algorithms. The significance of these results is as follows: (1) the algorithms provide
a space-time-frequency (Section III) or space-wavenumber-time-frequency (Sections IV and
V) representation of the inverse problem solution process; (2) the algorithm of Section III
operates even in the presence of time-and-frequency support-limited noise or interference,
unlike previous approaches; (3) the algorithms of Sections IV and V show how the data
at one resolution are coupled to the solution at another resolution; and (4) the new coupled
wave systems are related to multichannel coupled lattice structures associated with the
new block-slant-Toeplitz structured linear system of equations derived in Section V. New
contributions of this paper include all of the results of Sections III-V.

This paper is organized as follows. In Section II we quickly review the two-component
wave system inverse scattering problem, along with some of its applications, the layer
stripping algorithm for solving this problem, and the Krein integral equation for solving this
problem. We also quickly review the discrete wavelet expansion of a continuous function in
terms of wavelet basis functions which are orthonormal at different scales and translations.
In Section III the wavelet transform with respect to time is applied to the layer stripping
algorithm, resulting in a set of independent wave systems in which the waves propagate
at a speed determined by the dilation factor. This allows a different wave system to be
used to compute the reflectivity function at each depth, reducing the effects of noise in the
impulse reflection response.

In Section IV wavelet transforms with respect to both time and space are applied to
the layer stripping algorithm. This results in a set of coupled wave systems; in each system
the waves propagate at a speed determined by the two dilation factors. This allows the
reflectivity function to be reconstructed at different resolutions, from the impulse reflection
response at different resolutions (although not independently). We also specialize to the
specific results when a Haar wavelet basis function is used, and interpret the resulting
algorithm in the Born approximation. In Section V wavelet transforms with respect to
both time and space are applied to the Krein integral equation. This results in a
block-structured linear system of equations in which each block has a slanted-Toeplitz structure,
viz. the elements of a given block are constant along diagonals with slope determined by
the location of the block. This structure is related to the form of the multichannel coupled
wave system derived in Section IV. Section VI concludes the paper with a summary.

II.  QUICK  REVIEW  OF  INVERSE  SCATTERING  AND  THE  WAVELET  TRANSFORM 
A.  The  1-D  Inverse  Scattering  Problem 


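The basic one-dimensional inverse scattering problem considered in this paper is as
follows. Let x be a spatial variable, which we will call depth (increasing downward), and
t be time. A scattering medium is described by the two-component wave system

\[
\frac{\partial}{\partial x}
\begin{bmatrix} d(x,t) \\ u(x,t) \end{bmatrix}
=
\begin{bmatrix} -\dfrac{\partial}{\partial t} & -r(x) \\ -r(x) & \dfrac{\partial}{\partial t} \end{bmatrix}
\begin{bmatrix} d(x,t) \\ u(x,t) \end{bmatrix},
\qquad 0 < x < 1 \tag{1}
\]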


where the reflectivity function r(x) characterizes the scattering medium. The scattering
medium is thus a chain or series of differential scattering sections, each having the form
shown in Figure 1 (note the lattice structure of the medium, similar to the lattice filters
of linear prediction).

Note that if r(x) = 0, then d(x,t) = d(x − t) and u(x,t) = u(x + t). Thus d(x,t) and
u(x,t) can be interpreted as Downgoing and Upgoing waves, scattered into each other at
depth x by the reflectivity function r(x). The scattering medium is assumed to have finite
extent in x; without loss of generality this extent is scaled to 0 < x < 1. The boundary
conditions are a radiation condition at x = 1 (d(x,t) = d(x − t) and u(x,t) = 0 for x > 1)
and a free surface at x = 0 (d(0,t) = u(0,t), excluding sources). The free surface implies
that an upgoing wave at x = 0 is simply reflected into a downgoing wave.

This scattering medium is probed with an impulsive plane wave δ(t − x), which
propagates downward into the medium in increasing depth x as time t increases. The reflection
response k(t) of the medium to this impulse is measured at x = 0. This amounts to
initializing (1) with

\[
d(0,t) = \delta(t) + k(t); \qquad u(0,t) = k(t). \tag{2}
\]
The inverse scattering problem is then to compute the reflectivity function r(x) from the
impulse  reflection  response  k(t). 

Several types of inverse problems can be formulated as the above problem [1]-[3]. For
example, if the scattering medium is a continuously-layered acoustic medium with constant
wave speed and varying density ρ(x), then the problem of reconstructing ρ(x) from the
reflection response of the medium to an impulsive plane wave δ(t − x) can be formulated
as (1) by defining [5]

\[
d(x,t) = \frac{p(x,t)}{\sqrt{\rho(x)}} + \sqrt{\rho(x)}\,v(x,t); \qquad
u(x,t) = \frac{p(x,t)}{\sqrt{\rho(x)}} - \sqrt{\rho(x)}\,v(x,t); \qquad
r(x) = \frac{1}{2}\frac{d}{dx}\ln\rho(x) \tag{3}
\]

where p(x,t) is pressure in the medium and v(x,t) is velocity of the medium. In other
applications ρ(x) is replaced with local impedance of a nonuniform transmission line [2] or
cross-sectional area of the human vocal tract [3]. See the references in [6] for more details.

B.  Layer  Stripping  Solution  to  the  1-D  Inverse  Scattering  Problem 

It is clear from the forms of (1) and (2) that d(x,t) and u(x,t) are causal functions,
i.e., d(x,t) = u(x,t) = 0 for t < x. Discretizing depth x and time t into integer multiples
of a small constant Δ and using forward differences to approximate the partial derivatives
in (1), the two-component wave system (1) discretizes into [5],[6]

$$ \begin{bmatrix} d(x+\Delta,\, t+\Delta) \\ u(x+\Delta,\, t-\Delta) \end{bmatrix} = \begin{bmatrix} 1 & -r(x)\Delta \\ -r(x)\Delta & 1 \end{bmatrix} \begin{bmatrix} d(x,t) \\ u(x,t) \end{bmatrix} \tag{4a} $$

$$ r(x)\Delta = u(x,x)/d(x,x); \quad d(x,x) = \prod_{i=0}^{x/\Delta - 1} \left(1 - r(i\Delta)^2\Delta^2\right). \tag{4b} $$

Equation (4b) follows from setting t = x in (4a) and noting that u(x + Δ, x - Δ) = 0 by
causality. For Δ << 1 we have d(x,x) ≈ 1.

Equations (4), initialized using d(0,t) = u(0,t) = k(t), can be recursively propagated
in increasing depth x, recovering the reflectivity function r(x) along the way. This is the
essential idea of a layer stripping algorithm. Recall that by causality d(x,t) = u(x,t) = 0
for t < x, so (4a) is only propagated for t > x; at t = x the second equation of (4a) is
equivalent to (4b). Also note that d(x = mΔ, t = nΔ) = u(x = mΔ, t = nΔ) = 0 if m + n
is odd. The nonlinear multiple reflections are taken into account by the nonlinear product
terms in (4a). Such algorithms have been derived in many forms for many problems [1],[6].
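
As an illustration of the recursion (4), the following is a minimal Python sketch, not the implementation used in this work. It assumes NumPy, writes rho[m] for r(mΔ)Δ, and indexes time samples by n with t = nΔ. The helper synth_surface_data is a hypothetical test harness: it runs (4a) backward from below the last layer so that the surface data are guaranteed consistent with (4a); it is not a physical forward model with a free surface.

```python
import numpy as np

def synth_surface_data(rho):
    """Build surface data consistent with (4a) by running (4a) backward.

    Below the last layer the upgoing wave is zero and the downgoing wave
    is a unit impulse (an assumption made only to generate test data).
    """
    M = len(rho)
    N = 2 * M + 1                       # enough time samples for M layers
    d = np.zeros(N)
    u = np.zeros(N)
    d[M] = 1.0                          # impulse reaching depth M at time M
    for m in range(M - 1, -1, -1):
        a = np.append(d[1:], 0.0)       # a[n] = d(m+1, n+1)
        b = np.insert(u[:-1], 0, 0.0)   # b[n] = u(m+1, n-1)
        d = (a + rho[m] * b) / (1.0 - rho[m] ** 2)
        u = (b + rho[m] * a) / (1.0 - rho[m] ** 2)
    return d, u

def layer_strip(d0, u0, n_layers):
    """Recover rho[m] = r(m*Delta)*Delta from surface data using (4a)-(4b)."""
    d = np.array(d0, dtype=float)
    u = np.array(u0, dtype=float)
    rho = np.zeros(n_layers)
    for m in range(n_layers):
        rho[m] = u[m] / d[m]                 # (4b): t = x, i.e. sample n = m
        d_new = d - rho[m] * u               # first row of (4a)
        u_new = u - rho[m] * d               # second row of (4a)
        d = np.insert(d_new[:-1], 0, 0.0)    # d(x+Delta, t+Delta): delay by one
        u = np.append(u_new[1:], 0.0)        # u(x+Delta, t-Delta): advance by one
    return rho
```

Calling layer_strip on the output of synth_surface_data should return the reflection coefficients that were used to build the data, provided each |rho[m]| < 1.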

We recognize (4) as the Schur algorithm of signal processing. This algorithm has a
long and rich history dating back to 1917; see [11]. Note that (4) has the nice physical
interpretation of reconstructing a discrete layered medium, consisting of homogeneous
layers of thickness Δ. It arises here because such a discrete layered medium can also be
reconstructed by solving a Toeplitz or Hankel system of equations; the Schur algorithm
here is simply a fast algorithm to obtain the r(x)Δ [12].

C.  Integral  Equation  Solution  to  the  1-D  Inverse  Scattering  Problem 

An alternative to the layer stripping algorithm (4) is to solve the Krein integral
equation [4]

$$ k(x-t) = h(x,t) + \int_{-x}^{x} h(x,z)\,k(|z-t|)\,dz; \quad |t| \le x; \quad 0 < x < 1 \tag{5} $$

for h(x,t). r(x) can then be computed from h(x,t) using

$$ r(x) = 2h(x,-x). \tag{6} $$

Note the Toeplitz structure in the kernel k(|z - t|) of (5). This structure shows why, from a
fast algorithms perspective, the Schur algorithm (4) can be used to reconstruct r(x) from
k(t) more quickly than solving the integral equation (5) by discretizing it into a Toeplitz
system of equations.
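
To make the Toeplitz discretization concrete, here is a minimal sketch under stated assumptions: equation (5) is read with a plus sign and integration limits -x to x, a midpoint rule with n points is used, and the function name krein_reflectivity is ours. A dense solve is used for clarity; it does not exploit the Toeplitz structure, which a Levinson-type solver would.

```python
import numpy as np

def krein_reflectivity(k, x, n=400):
    """Estimate r(x) from k(t) by a midpoint discretization of (5), then (6).

    k must accept NumPy arrays.  The quadrature nodes z_j double as the
    collocation points t_i, so the kernel matrix is symmetric Toeplitz.
    """
    delta = 2.0 * x / n                           # quadrature step on [-x, x]
    t = -x + (np.arange(n) + 0.5) * delta         # t_i = z_i
    lag = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    A = np.eye(n) + delta * k(delta * lag)        # I + delta * k(|z_j - t_i|)
    h = np.linalg.solve(A, k(x - t))              # samples of h(x, t_i)
    return 2.0 * h[0]                             # (6), with t_0 = -x + delta/2
```

For a weak response such as k(t) = 0.01 exp(-t), the result should be close to the single-scattering value 2k(2x), with a small second-order correction.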

We recognize (5) as the Wiener-Hopf integral equation for computing the linear
least-squares filter h(x,t) for estimating a zero-mean wide-sense stationary random process, with
covariance function k(|x - t|), at time x from noisy observations, with additive white noise,
measured over the interval -x < t < x. Equation (6) then merely states the well-known
result of linear prediction that the reflectivity function (continuous reflection coefficient)
equals the filter weight at the far end of the interval of observation. This illustrates the
well-known connection between linear prediction and inverse scattering.

Applying the operator (∂/∂x + ∂/∂t) to (5) and using linearity and uniqueness of the
solution of (5) results in the Krein-Levinson equation [13]

$$ \left(\frac{\partial}{\partial x} + \frac{\partial}{\partial t}\right) h(x,t) = -r(x)\,h(x,-t); \quad |t| < x \tag{7a} $$

$$ \tfrac{1}{2}\,r(x) = h(x,-x) = k(2x) - \int_{-x}^{x} h(x,z)\,k(x+z)\,dz. \tag{7b} $$

Discretizing (7) in the same way that (1) was discretized to (4a) results in the familiar
Levinson algorithm for solving the discretized integral equation (5). Note that (7b)
discretizes into the "inner product" computation of the Levinson algorithm. See [6] and [12]
for more details. Note also that (7a) has the same form as the first equation of (1). This
illustrates that the form of the lattice equations (1) is related to any solution procedure
for the inverse scattering problem; see [6] and [12].

We note in passing that the two-component wave system (1) inverse scattering problem
can be transformed into a Schrodinger equation inverse scattering problem, and vice
versa, provided the functions are all smooth enough. The relations between the well-known
Schrodinger equation inverse scattering machinery, such as the Gel'fand-Levitan
and Marchenko integral equations, and the Krein integral equation and layer stripping
algorithm, are discussed in [6]. One important advantage of the two-component wave
system (1) over the Schrodinger equation is that double differentiations are not required.

D. Discrete Orthonormal Wavelet Transforms of Continuous Functions

The discrete orthonormal wavelet transform or representation F(m,n) of a continuous
square-integrable function f(x) is [9],[10]

$$ F(m,n) = \int_{-\infty}^{\infty} f(x)\,2^{m/2}\,\phi(2^m x - n)\,dx \tag{8a} $$

$$ f(x) = \sum_{m=-\infty}^{\infty}\ \sum_{n=-\infty}^{\infty} F(m,n)\,2^{m/2}\,\phi(2^m x - n) \tag{8b} $$

where φ(x) is the wavelet basis function. φ(x) has the properties that it is orthogonal
(in the sense of the usual inner product) to its scalings φ(2^m x) (dilations for m < 0;
compressions for m > 0) and to the translations φ(2^m x - n) of its scalings, and the set of all
scalings and translations {2^{m/2} φ(2^m x - n); m, n ∈ integers} forms a complete orthonormal
set (the factor 2^{m/2} is necessary for orthonormality). Algorithms for obtaining suitable
basis functions φ(t) are discussed in [9] and [14].
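
The orthonormality property can be checked numerically. The sketch below assumes the usual Haar function (+1 on [0, 1/2), -1 on [1/2, 1), zero elsewhere) as the basis function, restricts attention to functions supported on [0, 1), and uses a midpoint rule on 1024 cells, which is exact for Haar scales 0 <= m <= 9. The names haar, basis, and coeff are ours.

```python
import numpy as np

def haar(x):
    """Haar basis function: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    x = np.asarray(x, dtype=float)
    plus = np.where((x >= 0) & (x < 0.5), 1.0, 0.0)
    minus = np.where((x >= 0.5) & (x < 1), 1.0, 0.0)
    return plus - minus

def basis(m, n, x):
    """Scaled and translated basis function 2^(m/2) * phi(2^m x - n)."""
    return 2.0 ** (m / 2) * haar(2.0 ** m * x - n)

# Midpoints of 1024 equal cells covering [0, 1).
N = 1024
x = (np.arange(N) + 0.5) / N

def coeff(f_samples, m, n):
    """Midpoint-rule approximation of F(m, n) in (8a) for f sampled on x."""
    return np.sum(f_samples * basis(m, n, x)) / N
```

Feeding one basis function in as f should give a coefficient of 1 at its own (m, n) and 0 at any other scale or translation.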

The significance of the wavelet transform or expansion (8b) is as follows. Coarse
scalings (small m) amount to convolving f(x) with a basis function that is drawn out in
time, and then sampling. Equivalently, f(x) itself can be viewed as being compressed, and
then filtered with φ(x). Either way, there is clearly a loss of resolution: only long-scale
features of f(x) are captured in F(m,n) when m is small. On the other hand, fine scalings
(large m) amount to convolving f(x) with a tightly compacted basis function, and then
sampling, so that the resolution is quite high. Furthermore, since the basis function is so
tightly compacted, the high resolution is local, in that the wavelet transform at a given
point depends only on f(x) in the immediate vicinity of that point. Thus the fine scale
information is localized, while the coarse scale information is global.

Because the coarse scale wavelet transform carries only low-frequency (long-scale)
information, it may be sampled coarsely, with long intervals between the sample points.
The fine scale wavelet transform carries high-resolution (small-scale) information, and so
it must be sampled finely. It can be shown that the sampling grid shown in Figure 2 is
sufficient to completely represent f(x), provided φ(x) is chosen properly [14]. Furthermore,
each fine-scale sample will carry localized information about the local fine-scale behavior
of f(x). If the basis function φ(x) is localized in both time and frequency, the wavelet
transform will be a type of time-frequency representation of f(x). In [7] wavelet analysis
is noted as being analogous to a microscope: at low magnifications, most of the object is
visible, while at high magnifications, only a small part of the object is visible, and many
translations of it are needed to view all of the object.

The simplest example of a wavelet transform is to choose φ(x) as the Haar basis
function, shown in Figure 3. The wavelet transform (8a) of f(x) is then simply the Haar
transform or expansion of f(x). The algorithm of Section IV will be specialized to this
particular expansion; other choices of wavelet basis function φ(x) are also possible. In
general, the wavelet transform of f(x) can be computed quickly using filter banks. Finally,
we mention in passing that there are many types of wavelet transforms; in this paper, our
attention is restricted to (8).

III. WAVELET TRANSFORM IN TIME ONLY

A.  Derivation  of  Wavelet  Layer  Stripping  Algorithm 

In this section we show how r(x) can be differentially reconstructed from the wavelet
transform of k(x). The wavelet transform of (4) results in an uncoupled set of discrete
wave systems, each of which may be propagated separately. This results in some flexibility
in how layer stripping is used.

Let φ(x) be any wavelet basis function with compact support on the interval 0 < x < 1
(in this section φ(x) need not be an orthonormal basis function). Apply the wavelet
transform (8a), with x replaced by t, to equations (4). Note that now t is NOT discretized
to integer multiples of Δ; it is a continuous independent variable. However, x is still
discretized. This yields the single wavelet transform (in time only) layer stripping algorithm

$$ \begin{bmatrix} D(x+\Delta;\, m,\, n + 2^m\Delta) \\ U(x+\Delta;\, m,\, n - 2^m\Delta) \end{bmatrix} = \begin{bmatrix} 1 & -r(x)\Delta \\ -r(x)\Delta & 1 \end{bmatrix} \begin{bmatrix} D(x;\, m,\, n) \\ U(x;\, m,\, n) \end{bmatrix} \tag{9a} $$

$$ r(x)\Delta = U(x;\, m,\, n = 2^m x - 1)\,/\,D(x;\, m,\, n = 2^m x - 1). \tag{9b} $$

where D(x;m,n) and U(x;m,n) are the single wavelet transforms (in time only) of d(x,t)
and u(x,t), respectively.

Equation (9b) is derived as follows. Recall that the wavelet basis function φ(x) has
compact support on the interval 0 < x < 1, and that by causality d(x,t) = u(x,t) = 0 for
t < x. These imply that D(x;m,n) = U(x;m,n) = 0 for n < 2^m x - 1 (the -1 comes from
the length of the support of φ(x)). Setting n = 2^m x - 1 < 2^m (x + Δ) - 1 in the second
equation of (9a) then yields (9b).
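
The shift n -> n + 2^m Δ in the time-only transform comes from the time shift t -> t + Δ in (4a): a change of variables in (8a) turns a time advance by Δ into a translation-index shift of 2^m Δ, which may be fractional. The following sketch checks this numerically. It assumes the Haar basis function, a midpoint rule over the support, and a hypothetical Gaussian waveform d; none of these choices come from the original derivation.

```python
import numpy as np

def wt_time(f, m, n, J=512):
    """Midpoint-rule estimate of (8a) in t with the Haar basis; n may be fractional."""
    s = (np.arange(J) + 0.5) / J              # position inside the unit support
    sign = np.where(s < 0.5, 1.0, -1.0)       # Haar values on [0, 1)
    t = (n + s) / 2.0 ** m                    # matching time samples
    return np.sum(f(t) * sign) * 2.0 ** (m / 2) / (J * 2.0 ** m)

Delta = 2.0 ** -3                             # step Delta = 2^(-M) with M = 3
d = lambda t: np.exp(-(t - 0.6) ** 2 / 0.02)  # hypothetical waveform at one depth

m, n = 2, 1                                   # 2^m * Delta = 0.5: a fractional shift
lhs = wt_time(lambda t: d(t + Delta), m, n)   # transform of the advanced signal
rhs = wt_time(d, m, n + 2.0 ** m * Delta)     # same signal, shifted translation index
```

Here lhs and rhs agree to rounding error, which is the identity used to pass from d(x + Δ, t + Δ) in (4a) to D(x + Δ; m, n + 2^m Δ).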

Equation (9) is initialized using D(0;m,n) = U(0;m,n) = K(m,n), where K(m,n)
is the wavelet transform of the impulse reflection response k(t). Like (4), equation (9)
can be recursively propagated in increasing depth x, recovering the reflectivity function
r(x) along the way. Since D(x;m,n) = U(x;m,n) = 0 for n < 2^m x - 1, equation (9) is
only propagated for n > 2^m x - 1, so the size of the shift in the translation variable n is
appropriate.


B.  Comments 


Although  equations  (9)  look  similar  to  (4),  there  are  several  important  distinctions 
between  the  two  algorithms.  We  summarize  those  distinctions  here: 

1. When (4) is actually implemented on a computer as the Schur algorithm, x and t
are discretized to integer multiples of Δ. In deriving (9) from (4), t is treated as a
continuous independent variable, so that the wavelet transform with respect to t of
(4) can be taken;

2. The wavelet transform breaks up the continuous (in t) equations (4) into a set of
completely decoupled discrete-space discrete-time wave systems. The wave speed of
the wave system at resolution m is 2^{-m}. This explains why D(x;m,n) and U(x;m,n)
are zero for n < 2^m x - 1, while d(x,t) and u(x,t) are zero only for t < x: the finer the
scale, the slower the wave speed;

3. Note that fractional values of the translation index n are required for scales m such
that 2^m Δ is not an integer; it is suggested that Δ = 2^{-M} for some integer M. This
does not create any difficulties; to interpret this, see Figure 2. Fractional values of the
translation index n simply correspond to samples taken in between the sample points
of Figure 2; in essence, we have oversampled, redundant wavelet representations of
d(x,t) and u(x,t), for each x;

4. Note that save for the wave speed of 2^{-m} ≠ 1, equations (9), for each m, have the
same form as the Schur algorithm. The wavelet transform decouples the original layer
stripping algorithm into a set of Schur-like algorithms. However, the original layer
stripping algorithm requires that k(t) be discretized in t to create a fully discrete
algorithm, while the wavelet layer stripping algorithm uses the continuous function
k(t), expanded into a set of discrete time-frequency coefficients, each of which is then
propagated in Schur-like algorithms;

5. As a result, the Schur-like recursions (9) can be interpreted as solving a set of
multiresolution discrete inverse scattering problems. They can be replaced by discrete
Riccati equations, interpreted as fast algorithms for solving discretized integral
equations, viewed as representing discrete transmission lines, etc. [12].

C.  Implementations  of  Wavelet  Layer  Stripping  Algorithm 

The most significant aspect of (9) is the flexibility afforded by the complete decoupling
of the discrete wave systems. In order to reconstruct r(x) from the wavelet expansion
K(m,n) of the impulse reflection response k(x), we may do any of the following:

1. Select any single scale m, and reconstruct r(x) using solely the wave system at that
scale. This requires K(m,n) for a single m; fractional values of n will be required if
2^m Δ is not an integer. This is not a problem, since we are given k(x) as data, so we
can compute its wavelet transform for any m and n.

This can be understood as follows. At coarse scales, the coarse sampling will not
furnish sufficient information to reconstruct r(x) discretized to integer multiples of Δ. At
finer scales, the fine sampling furnishes more than sufficient information, and some samples
are skipped: if Δ = 2^{-M} and m > M, only every 2^{m-M}-th sample (in n) of K(m,n) will be
used.

This choice should be made if interference or noise in the data K(m,n) is present on
most scales, but small or absent on a specific scale. Note that the wavelet basis function
can be selected to bring about this situation, if possible.

2. We may start off using a single scale m1, and then at depth x = x_s switch to a coarser
scale m2 < m1. This can be accomplished by subsampling the finer scale wavelet
coefficients to obtain the coarser scale coefficients [9]. Since we are given k(x), we
know K(m1,n), and thus D(x_s;m1,n) and U(x_s;m1,n), for all n.

This choice should be made if interference or noise in the data is localized in the wavelet
transform domain, i.e., present at scale m1 and absent at scale m2 for some translations,
and then absent at scale m1 and present at scale m2 for further translations.

3. We can propagate all of the scales, and compute r(x) at a given depth x by performing
a least-squares fit to (9b), analogous to what was done in [5]. That is, we determine
the value of r(x) that minimizes

$$ \sum_{m} \left[\, r(x)\Delta - U(x;\, m,\, n = 2^m x - 1)\,/\,D(x;\, m,\, n = 2^m x - 1) \,\right]^2; $$

this of course is simply the mean of the U(x;m,n = 2^m x - 1)/D(x;m,n = 2^m x - 1).
This choice should be made if interference or noise in the data is present at all scales.
Since interference will have different effects on different scales, the various r(x) computed
from discrete wave systems at different scales m will differ; using all of the scales will result
in an improved estimate of r(x), as in [5].
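
The least-squares combination in option 3 is a one-unknown fit, so its solution is the arithmetic mean of the per-scale ratios. A small sketch, assuming the ratios U/D have already been computed at each scale (the function name combine_scales and the sample values in the usage note are illustrative only):

```python
import numpy as np

def combine_scales(ratios):
    """Least-squares estimate of r(x)*Delta from per-scale ratios U/D in (9b)."""
    q = np.asarray(ratios, dtype=float)
    A = np.ones((q.size, 1))                  # one unknown, one equation per scale
    est, *_ = np.linalg.lstsq(A, q, rcond=None)
    return est[0]
```

For example, combine_scales([0.10, 0.12, 0.08]) agrees with np.mean of the same list. A weighted fit would be a natural variation if the noise level is known to differ between scales, but that goes beyond what is stated here.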


D.  Comparison  Between  Wavelet  and  Frequency  Domain  Implementations 

It may seem surprising that a wavelet expansion of k(t) can furnish different
information about r(x) at different scales: we seem to be exploiting redundant information, so
that there should be no difference in results at different scales. To understand how this
can happen, take the temporal Fourier transform of (1). This yields the differential Schur
recursions [6]

$$ \frac{d}{dx} \begin{bmatrix} \hat d(x,\omega) \\ \hat u(x,\omega) \end{bmatrix} = \begin{bmatrix} -j\omega & -r(x) \\ -r(x) & j\omega \end{bmatrix} \begin{bmatrix} \hat d(x,\omega) \\ \hat u(x,\omega) \end{bmatrix} \tag{10} $$

where $\hat d(x,\omega) = \int_{-\infty}^{\infty} d(x,t)\,e^{-j\omega t}\,dt$, and similarly for $\hat u(x,\omega)$.

Now suppose (10) is initialized at the surface x = 0 with $\hat d(0,\omega) = \hat u(0,\omega) = \hat k(\omega)$.
The reflectivity function r(x) can then be computed from $\hat u(x,\omega)$ using either of two
expressions [6]:

$$ r(x) = \frac{1}{\pi} \int_{-\infty}^{\infty} \hat u(x,\omega)\,e^{j\omega x}\,d\omega = \lim_{\omega \to \infty} 2j\omega\,e^{j\omega x}\,\hat u(x,\omega). \tag{11} $$

The significance of (10) is that each spectral component is propagated separately: there
is no mixing of components at different frequencies ω. This is not surprising: viewed as
functions of time alone, (1) are linear (there are no products of two functions of t). This is
analogous to the temporal wavelet transform of (4) resulting in a decoupled set of discrete
wave systems, in which each scale or resolution propagates separately.

The significance of (11) is that once the equations have been propagated to depth
x, r(x) may be reconstructed either from all of the components, or from just the highest
frequency components. That is, the highest-frequency components of (10) propagate
completely independently of other components, since the second equality of (11) may be used
to determine r(x) independently of the other components. Thus, even if there is
bandlimited noise in the data, r(x) can still (in principle) be reconstructed without error: simply
propagate only the highest-frequency components.
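
The high-frequency limit in (11) can be illustrated on a closed-form example. The sketch below assumes a hypothetical upgoing wave u(x,t) = (r/2) exp(-a(t - x)) for t >= x and zero otherwise, so that u(x,x) = r/2 as in (20); its Fourier transform is (r/2) exp(-jωx)/(a + jω). It also assumes (11) is read as r(x) = lim 2jω exp(jωx) û(x,ω). The decay rate a and the function names are ours.

```python
import numpy as np

def u_hat(omega, x, r, a):
    """Fourier transform of u(x,t) = (r/2) exp(-a (t - x)) for t >= x, else 0."""
    return 0.5 * r * np.exp(-1j * omega * x) / (a + 1j * omega)

def r_from_high_freq(omega, x, r, a):
    """Second expression in (11), evaluated at a finite frequency omega."""
    return 2j * omega * np.exp(1j * omega * x) * u_hat(omega, x, r, a)
```

Algebraically the estimate is r jω/(a + jω), which tends to r as ω grows; the error is of order r a/ω, so only the leading jump of u at t = x matters at high frequency.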

Something analogous to (10) and (11) is happening in (9a) and (9b). The differences
are as follows. First, since the wavelet transform is redundant, each scale can be propagated
independently of other scales, rather than just the highest-frequency components. Second,
and more important, the wavelet transform allows use of representations other than those
in terms of sinusoids. By choosing a proper wavelet basis function, interference can be
confined to only some scales, or features of k(t) can be efficiently represented, etc.

IV.  WAVELET  TRANSFORM  IN  SPACE  AND  TIME-LAYER  STRIPPING 

A.  Derivation  of  Coupled  Discrete  Wave  Systems 

In this section we extend the previous result to the problem of reconstructing the
wavelet transform of r(x) directly from the wavelet transform of k(x). That is, we explore
the relation between r(x) at different scales and k(x) at different scales. Since the problem
is nonlinear in variable x, due to the r(x)u(x,t) and r(x)d(x,t) terms, we expect this
relation to be complicated, and it turns out to be so. The result is again a set of discrete
wave systems, which are now coupled. However, an approximate algorithm is still possible.

Let φ(x) be an orthonormal wavelet basis function orthogonal to its scalings and
translations (alternatively, φ(x) may have a "tight frame" [14]) such that the set of its
scalings and translations form a complete set, and having compact support on the interval
0 < x < 1. Obviously the Haar basis function would be suitable here. Define the following
two-dimensional wavelet expansions:

$$ D(m_1,n_1,m_2,n_2) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} d(x,t)\,2^{m_1/2}\phi(2^{m_1}x - n_1)\,2^{m_2/2}\phi(2^{m_2}t - n_2)\,dx\,dt; $$

$$ d(x,t) = \sum_{m_1}\sum_{n_1}\sum_{m_2}\sum_{n_2} D(m_1,n_1,m_2,n_2)\,2^{m_1/2}\phi(2^{m_1}x - n_1)\,2^{m_2/2}\phi(2^{m_2}t - n_2) \tag{12a} $$

$$ U(m_1,n_1,m_2,n_2) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} u(x,t)\,2^{m_1/2}\phi(2^{m_1}x - n_1)\,2^{m_2/2}\phi(2^{m_2}t - n_2)\,dx\,dt; $$

$$ u(x,t) = \sum_{m_1}\sum_{n_1}\sum_{m_2}\sum_{n_2} U(m_1,n_1,m_2,n_2)\,2^{m_1/2}\phi(2^{m_1}x - n_1)\,2^{m_2/2}\phi(2^{m_2}t - n_2) \tag{12b} $$

$$ R(m_3,n_3) = \int_{-\infty}^{\infty} r(x)\,2^{m_3/2}\phi(2^{m_3}x - n_3)\,dx; \quad r(x) = \sum_{m_3}\sum_{n_3} R(m_3,n_3)\,2^{m_3/2}\phi(2^{m_3}x - n_3). \tag{12c} $$

We use odd-subscripted variables m1, m3, m5, etc. to denote spatial variables, and
even-subscripted variables m2, m4, m6, etc. to denote temporal variables, when spatial and
temporal variables are both present.

Also define the discrete function

$$ C(m_1,n_1,m_2,n_2,m_3,n_3) = \int_{-\infty}^{\infty} 2^{(m_1+m_2+m_3)/2}\,\phi(2^{m_1}x - n_1)\,\phi(2^{m_2}x - n_2)\,\phi(2^{m_3}x - n_3)\,dx. \tag{13} $$

It is clear that C(m1,n1,m2,n2,m3,n3) is entirely symmetric with respect to permutations
of the subscripts {1,2,3}. Suppose without loss of generality that we have m1 ≤ m2 ≤ m3.
Since φ(x) has finite support on the interval 0 < x < 1, it is straightforward to show that

$$ \begin{aligned} C(m_1,n_1,m_2,n_2,m_3,n_3) = 0 \ \text{unless} \quad & 2^{(m_2-m_1)}n_1 \le n_2 \le 2^{(m_2-m_1)}(n_1+1) - 1 \\ \text{and} \quad & 2^{(m_3-m_2)}n_2 \le n_3 \le 2^{(m_3-m_2)}(n_2+1) - 1 \\ \text{(implies)} \quad & 2^{(m_3-m_1)}n_1 \le n_3 \le 2^{(m_3-m_1)}(n_1+1) - 1 \end{aligned} \tag{14} $$

for any wavelet basis function φ(x). Usually (14) simplifies considerably (see Section C
below).

Now insert the wavelet expansions in (12) into the layer stripping equations (4) (now
regarding both x and t as continuous independent variables), and apply the wavelet
transform (8a) twice, first in x, and then with x replaced by t, to the result. Using (12) and
(13), this yields

$$ D(m_1,\, n_1 + 2^{m_1}\Delta,\, m_2,\, n_2 + 2^{m_2}\Delta) = D(m_1,n_1,m_2,n_2) - \sum_{m_3}\sum_{n_3}\sum_{m_5}\sum_{n_5} C(m_1,n_1,m_3,n_3,m_5,n_5)\,\Delta R(m_3,n_3)\,U(m_5,n_5,m_2,n_2); \tag{15a} $$

$$ U(m_1,\, n_1 + 2^{m_1}\Delta,\, m_2,\, n_2 - 2^{m_2}\Delta) = U(m_1,n_1,m_2,n_2) - \sum_{m_3}\sum_{n_3}\sum_{m_5}\sum_{n_5} C(m_1,n_1,m_3,n_3,m_5,n_5)\,\Delta R(m_3,n_3)\,D(m_5,n_5,m_2,n_2). \tag{15b} $$

For small Δ we have d(x,x) = 1 in (4b) to second order in Δ ((11) also states this). Now
set t = x in the expansion (12b), apply the wavelet transform (8a), and use (12c) and (13).
This yields

$$ R(m_1,n_1)\,\Delta = \sum_{m_2}\sum_{n_2}\sum_{m_3}\sum_{n_3} C(m_1,n_1,m_2,n_2,m_3,n_3)\,U(m_3,n_3,m_2,n_2). \tag{16} $$

Alternatively, we can avoid the (good) approximation d(x,x) ≈ 1 by using an
argument analogous to the one used to derive (9b). Recall that the wavelet basis function
φ(x) has compact support on the interval 0 < x < 1, and that by causality d(x,t) =
u(x,t) = 0 for t < x. These imply that D(m1,n1,m2,n2) = U(m1,n1,m2,n2) = 0 for
2^{-m2} n2 < 2^{-m1}(n1 - 1) (this can be seen by considering the two-dimensional (x,t) plane;
again the -1 comes from the length of the support of φ(x)). Setting n2 to the corresponding
boundary value in (15b) then yields a linear system of equations for R(m1,n1) (note from (14) that
C(m1,n1,m2,n2,m3,n3) is nonzero only for certain n_i). However, this approach does
not seem to be warranted, due to its complexity; we mention it only due to its similarity to
the computation of the matrix reflection coefficient in the multichannel Schur algorithm
[15].


B. Comments

We  comment  on  some  distinctions  between  equations  (15)  and  (9): 

1. The most interesting aspect of (15) is that in the TWO-dimensional wavelet transform
domain, the coupled wave system (4) (continuous in both the independent variables
x and t) is AGAIN broken up into a set of coupled discrete-space discrete-time wave
systems. The wave speed of the discrete wave system at resolutions m1 (for x) and
m2 (for t) is 2^{m1-m2}; this explains the causality relation noted above. This is quite
remarkable, considering the nonlinearity of the problem in the variable x;

2. In terms of fast algorithms, (15) describes a coupled multichannel lattice structure,
with coupling between different scales specified by the summation over m5, and the
coupling coefficient computed from R(·) and C(·) by the summations over m3 and n3.
The summation over n5 has limited range (see (14)), and corresponds to mixing of
the scales in the interval (x, x + Δ);

3. Initialization of (15) requires knowledge of D(m1,n1,m2,n2) and U(m1,n1,m2,n2)
for 0 < n1 < (2^{m1} - 1)Δ. This corresponds to knowledge of d(x,t) and u(x,t) for
0 < x < Δ. Recall that the discretized Schur algorithm (4) physically reconstructs a
discrete layered medium consisting of homogeneous layers of thickness Δ; thus we
have d(x,t) = k(t - x) and u(x,t) = k(t + x) for 0 < x < Δ. Equations (15) can
similarly be initialized if we assume a homogeneous medium only for 0 < x < Δ;

4. In propagating (15) and (16), note that fractional values of the translation index n
are required for scales m such that 2^m Δ is not an integer (again it is suggested that
Δ = 2^{-M} for some integer M). This has the same interpretation as in the time-only
wavelet transform algorithm (see Section III), although now this applies for both space
and time;

5. However, fractional values of n create a problem in propagating the coupled equations
(15), since the nonzero values of the coupling function C(·) may require
not-yet-computed function updates. To see this, consider two updates

$$ D(m_1,n_1,m_2,n_2) \to D(m_1,\, n_1 + 2^{(m_1-M)},\, m_2,\, n_2 + 2^{(m_2-M)}) $$

$$ D(m_3,n_3,m_4,n_4) \to D(m_3,\, n_3 + 2^{(m_3-M)},\, m_4,\, n_4 + 2^{(m_4-M)}), $$

where m1 < M. From (14), it is possible for D(m3,n3,m4,n4) to be coupled to
D(m1, n1 + 1, m2, n2), which has not been computed yet since 2^{(m1-M)} < 1.

C.  Example:  Haar  Basis  Function 

To illustrate better the above points, and the operation of (15) and (16), we now
specialize to the Haar wavelet basis function shown in Figure 3. This illustrates typical
simplifications that occur when (14) and (15) are specialized to a specific basis function,
and results in simpler equations.

First, we need to compute the function C(m1,n1,m2,n2,m3,n3). Again, without loss
of generality we assume m1 ≤ m2 ≤ m3. It is straightforward to show that for the Haar
wavelet basis function we have

$$ C(m_1,n_1,m_2,n_2,m_3,n_3) = \begin{cases} +2^{m_1/2} & \text{if } m_1 < m_2 = m_3 \ \text{and}\ 2^{(m_2-m_1)}n_1 \le n_2 = n_3 \le 2^{(m_2-m_1)}(n_1 + 1/2) - 1; \\ -2^{m_1/2} & \text{if } m_1 < m_2 = m_3 \ \text{and}\ 2^{(m_2-m_1)}(n_1 + 1/2) \le n_2 = n_3 \le 2^{(m_2-m_1)}(n_1 + 1) - 1; \\ 0 & \text{otherwise.} \end{cases} \tag{17} $$

Note in particular that if m1 ≠ m2 ≠ m3 then C(·) = 0, and if m1 = m2 = m3 then
C(·) = 0. Equations (15) then simplify to


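$$ D(m_1,\, n_1 + 2^{m_1}\Delta,\, m_2,\, n_2 + 2^{m_2}\Delta) = D(m_1,n_1,m_2,n_2) - \sum_{\substack{m_3 = 0 \\ m_3 \ne m_1}}^{\infty}\ \sum_{n_3 = 2^{(m_3-m_1)}n_1}^{2^{(m_3-m_1)}(n_1+1)-1} \pm 2^{\mathrm{MIN}[m_1,m_3]/2}\,\Delta R(\mathrm{MAX}[m_1,m_3],\, n)\,U(m_3,n_3,m_2,n_2) - \Big( \sum_{m_3=0}^{m_1-1} \pm 2^{m_3/2}\,\Delta R(m_3,\, n_3 = [2^{(m_3-m_1)}n_1]) \Big)\,U(m_1,n_1,m_2,n_2); \tag{18a} $$

$$ U(m_1,\, n_1 + 2^{m_1}\Delta,\, m_2,\, n_2 - 2^{m_2}\Delta) = U(m_1,n_1,m_2,n_2) - \sum_{\substack{m_3 = 0 \\ m_3 \ne m_1}}^{\infty}\ \sum_{n_3 = 2^{(m_3-m_1)}n_1}^{2^{(m_3-m_1)}(n_1+1)-1} \pm 2^{\mathrm{MIN}[m_1,m_3]/2}\,\Delta R(\mathrm{MAX}[m_1,m_3],\, n)\,D(m_3,n_3,m_2,n_2) - \Big( \sum_{m_3=0}^{m_1-1} \pm 2^{m_3/2}\,\Delta R(m_3,\, n_3 = [2^{(m_3-m_1)}n_1]) \Big)\,D(m_1,n_1,m_2,n_2), \tag{18b} $$

where:

1. n = n1 if m1 > m3 and n = n3 if m1 < m3 in R(MAX[m1,m3],n);

2. The sign in ±2^{MIN[m1,m3]/2} and ±2^{m3/2} is chosen according to the value of n3 (see (17));

3. For m3 < m1 the sum over n3 becomes a single value n3 = [2^{(m3-m1)} n1], where [·]
denotes the greatest integer function.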
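
The Haar values of the coupling function can be checked numerically. The sketch below assumes that (13) is the integral of the product of three scaled and translated basis functions, that the Haar function is +1 on [0, 1/2) and -1 on [1/2, 1), and that all supports lie inside [0, 1) with scales 0 <= m <= 9 so that a midpoint rule on 1024 cells is exact. The function names are ours.

```python
import numpy as np

def haar(x):
    """Haar basis function: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    x = np.asarray(x, dtype=float)
    plus = np.where((x >= 0) & (x < 0.5), 1.0, 0.0)
    minus = np.where((x >= 0.5) & (x < 1), 1.0, 0.0)
    return plus - minus

def basis(m, n, x):
    """Scaled and translated basis function 2^(m/2) * phi(2^m x - n)."""
    return 2.0 ** (m / 2) * haar(2.0 ** m * x - n)

def coupling(m1, n1, m2, n2, m3, n3, levels=10):
    """Triple-product integral of Haar basis functions over [0, 1), as in (13)."""
    N = 2 ** levels
    x = (np.arange(N) + 0.5) / N              # cell midpoints
    prod = basis(m1, n1, x) * basis(m2, n2, x) * basis(m3, n3, x)
    return np.sum(prod) / N
```

With m1 = 1 and m2 = m3 = 3, translations n2 = n3 in the first half of the coarse support give +2^{1/2} and those in the second half give -2^{1/2}; three distinct scales, or three equal scales, give zero.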

Note the simplicity of the coupling between scales in (18), and how the coupling
between a scale and itself is more complex. Also note that the coupling from a coarser
scale (smaller m1) to a finer scale (larger m1) varies exponentially with the smaller scale.
This suggests that one possible way of implementing (18) is to simply neglect the coupling
from scales with m < M (where Δ = 2^{-M}) to scales with m > M, since the coarser a scale
is, the smaller its coupling to finer scales is (note that if m < M, then 2^m Δ = 2^{m-M} < 1).
After propagating the finer scales with m > M, and having computed the values needed
to propagate the coarser scales with m < M, these coarser scales can then be propagated.

D. The Born Approximation

The Born approximation is a linearization of the inverse scattering problem; the idea
is to make the reflectivity function r(x) be linearly related to the reflection response k(x).
One way to do this is to scale r(x) by a small parameter ε, expand d(x,t) and u(x,t) in
Taylor series in ε, and discard all terms of order ε^2 or smaller. This eliminates the nonlinear
terms r(x)d(x,t) and r(x)u(x,t) in (1). The result is

$$ \left(\frac{\partial}{\partial x} + \frac{\partial}{\partial t}\right) d(x,t) = 0 \ \to\ d(x,t) = d(x - t) \tag{19a} $$

$$ \left(\frac{\partial}{\partial x} - \frac{\partial}{\partial t}\right) u(x,t) = 0 \ \to\ u(x,t) = u(x + t) \tag{19b} $$

and these in conjunction with (11) immediately yield

$$ r(x) = 2u(x,x) = 2u(0,2x) = 2k(2x). \tag{20} $$

Equation (20) has a nice physical interpretation: the reflection response is due entirely to
direct reflections of the probing impulse by the reflectivity function of the medium. Note
that multiple scattering, which is inherently nonlinear, is neglected; for this reason the
Born approximation is often called a single-scattering approximation.

Taking the wavelet transform (8) of (20) gives the very simple result

$$ R(m,n) = \sqrt{2}\,K(m-1,\,n). \tag{21} $$

In the Born approximation, the wavelet transform of the reflectivity function equals (up to
the factor √2) the wavelet transform of the reflection response, on a scale one octave coarser.
This is due to two-way traveltime: the reflectivity function r(x) is imaged at time t = 2x since the
impulse must travel down to depth x and then back to the surface to be measured as k(x).
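
Relation (21) follows from the change of variable t = 2x in (8a), and it can be confirmed numerically. The sketch below assumes the Haar basis function, a midpoint rule over the support, and a hypothetical response k(t) = exp(-t) sin(3t); the names haar_coeff, k, and r_born are ours.

```python
import numpy as np

def haar_coeff(f, m, n, J=512):
    """Midpoint-rule estimate of the Haar wavelet coefficient (8a) of f at (m, n)."""
    s = (np.arange(J) + 0.5) / J              # position inside the unit support
    sign = np.where(s < 0.5, 1.0, -1.0)       # Haar values on [0, 1)
    x = (n + s) / 2.0 ** m                    # matching sample points
    return np.sum(f(x) * sign) * 2.0 ** (m / 2) / (J * 2.0 ** m)

k = lambda t: np.exp(-t) * np.sin(3.0 * t)    # hypothetical reflection response
r_born = lambda x: 2.0 * k(2.0 * x)           # Born reflectivity from (20)

R = haar_coeff(r_born, 3, 2)                  # R(m, n) with m = 3, n = 2
K = haar_coeff(k, 2, 2)                       # K(m - 1, n)
```

The two coefficients satisfy R = sqrt(2) K to rounding error, because the quadrature points for K are exactly twice those for R.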


V.  WAVELET  TRANSFORM  IN  SPACE  AND  TIME-INTEGRAL  EQUATION 
A.  Derivation  of  Discrete  Linear  System  of  Equations 


To obtain an alternative to the two-dimensional wavelet transform domain layer
stripping algorithm of Section IV, we now apply the two-dimensional wavelet transform to the
Krein integral equation (5). The result is a linear system of equations with
block-slanted-Toeplitz structure (defined below), which relates r(x) at different scales to k(x) at different
scales. Unlike the algorithm of Section IV, no approximations are necessary to use this
result.

First,  we  make  a  minor  change  in  the  Krein  integral  equation  (5).  Define 

$$k'(x) = 2k(2x); \quad h'(x,t) = 2h(x, 2t - x); \quad t' = (x + t)/2; \quad z' = (x + z)/2.$$  (22)


Then (5) (multiplied by 2) and (6) can be rewritten as

$$k'(x - t') = h'(x,t') + \int_0^x h'(x,z')\,k'(|z' - t'|)\,dz'; \quad 0 \le t' \le x; \quad 0 \le x \le 1$$  (23)

$$r(x) = h'(x,0)$$  (24)


respectively. This merely changes the interval $-x \le t \le x$ to the interval $0 \le t' \le x$, for
both the range of integration and the range of validity of the integral equation. We further
alter the former to $-\infty < t < \infty$ by simply defining h'(x,t) = 0 for t < 0 or t > x; this
will prove useful below.

Now let the wavelet basis function $\phi(x)$ have the properties listed at the beginning of
Section IV. Define the following two-dimensional wavelet expansions:


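$$H'(m_1,n_1,m_2,n_2) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} h'(x,t)\,2^{m_1/2}\phi(2^{m_1}x - n_1)\,2^{m_2/2}\phi(2^{m_2}t - n_2)\,dx\,dt;$$

$$h'(x,t) = \sum_{m_1}\sum_{n_1}\sum_{m_2}\sum_{n_2} H'(m_1,n_1,m_2,n_2)\,2^{m_1/2}\phi(2^{m_1}x - n_1)\,2^{m_2/2}\phi(2^{m_2}t - n_2);$$  (25a)

$$K'(m_3,n_3) = \int_{-\infty}^{\infty} k'(x)\,2^{m_3/2}\phi(2^{m_3}x - n_3)\,dx;$$

$$k'(x) = \sum_{m_3}\sum_{n_3} K'(m_3,n_3)\,2^{m_3/2}\phi(2^{m_3}x - n_3).$$  (25b)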
Also define the discrete function (note the difference from (13))

$$E(m_1,n_1,m_2,n_2,m_3,n_3) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} 2^{m_1/2}\phi(2^{m_1}x - n_1)\,2^{m_2/2}\phi(2^{m_2}y - n_2)\,2^{m_3/2}\phi(2^{m_3}(x - y) - n_3)\,dx\,dy.$$  (26)

Defining $x' = x - 2^{-m_1}n_1$ and $y' = y - 2^{-m_2}n_2$, we can rewrite (26) as


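$$E(m_1,n_1,m_2,n_2,m_3,n_3) = 2^{(m_1+m_2+m_3)/2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \phi(2^{m_1}x')\,\phi(2^{m_2}y')\,\phi\big(2^{m_3}(x' - y') - n_3 + 2^{m_3}(2^{-m_1}n_1 - 2^{-m_2}n_2)\big)\,dx'\,dy'$$

$$= E(m_1,m_2,m_3,n_3,(2^{-m_1}n_1 - 2^{-m_2}n_2)).$$  (27)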

Unlike $C(m_1,n_1,m_2,n_2,m_3,n_3)$, $E(m_1,m_2,m_3,n_3,(2^{-m_1}n_1 - 2^{-m_2}n_2))$ is not symmetric
with respect to permutations of the subscripts {1, 2, 3}. However, it does have the slanted-Toeplitz
structure, defined as being a function not of $n_1$ and $n_2$ separately, but a function
of only their weighted difference $2^{-m_1}n_1 - 2^{-m_2}n_2$.
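
The slanted-Toeplitz property can be illustrated by listing which index pairs $(n_1, n_2)$ share the same weighted difference within one block. The sketch below is only an index-bookkeeping illustration: the function names are hypothetical, and the index ranges $0 \le n_i \le 2^{m_i} - 1$ are an assumption made for the example. It does not evaluate the integral (27).

```python
from collections import defaultdict
from fractions import Fraction

def weighted_difference(m1, n1, m2, n2):
    """Exact value of 2^(-m1)*n1 - 2^(-m2)*n2."""
    return Fraction(n1, 2 ** m1) - Fraction(n2, 2 ** m2)

def slanted_diagonals(m1, m2):
    """Group the index pairs (n1, n2) of the (m1, m2) block by weighted difference.

    Pairs in the same group correspond to entries that a slanted-Toeplitz
    block forces to be equal.
    """
    groups = defaultdict(list)
    for n1 in range(2 ** m1):          # assumed range 0 <= n1 <= 2^m1 - 1
        for n2 in range(2 ** m2):      # assumed range 0 <= n2 <= 2^m2 - 1
            groups[weighted_difference(m1, n1, m2, n2)].append((n1, n2))
    return dict(groups)

groups = slanted_diagonals(1, 2)
```

For the block with $m_1 = 1$ and $m_2 = 2$, the pairs (0,0) and (1,2) fall in the same group, as do (0,1) and (1,3): moving one step in $n_1$ corresponds to two steps in $n_2$.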


B.  Linear  System  of  Equations 

Now  insert  the  wavelet  expansions  in  (25)  into  the  modified  Krein  integral  equation 
(23),  and  apply  the  wavelet  transform  (8a)  twice,  first  in  x,  and  then  with  x  replaced  by 
t,  to  the  result.  Using  (25)-(27),  this  yields 

$$K''(m_1,m_2,(2^{-m_1}n_1 - 2^{-m_2}n_2)) = H'(m_1,n_1,m_2,n_2) + \sum_{m_3=0}^{M}\sum_{n_3} H'(m_1,n_1,m_3,n_3)\,K''(m_3,m_2,(2^{-m_3}n_3 - 2^{-m_2}n_2));$$

$$0 \le n_1 \le 2^{m_1} - 1; \quad 0 \le 2^{-m_2}n_2 \le 2^{-m_1}n_1 - 1; \quad 0 \le m_1, m_2 \le M,$$  (28)

where $K''(\cdot)$ is defined from (25b) and (27) as

$$K''(m_1,m_2,(2^{-m_1}n_1 - 2^{-m_2}n_2)) = \sum_{m_3}\sum_{n_3} E(m_1,m_2,m_3,n_3,(2^{-m_1}n_1 - 2^{-m_2}n_2))\,K'(m_3,n_3),$$  (29)
and M is the (arbitrarily large) finest scale used in the wavelet expansion. The summation
over $n_3$ in (28) is over all integer values of $n_3$ such that

$$0 \le 2^{-m_3}n_3 \le 2^{-m_1}n_1 - 1.$$  (30)


The system matrix specified by $K''(\cdot)$ in the system of equations (28) has a block-slanted-Toeplitz
structure, in that in the $(m_1,m_2)$th block the elements $(n_1,n_2)$ are equal
for constant values of $2^{-m_1}n_1 - 2^{-m_2}n_2$, i.e., along diagonals of slope $2^{m_2 - m_1}$
($n_2$ advances by $2^{m_2}$ for every $2^{m_1}$ steps in $n_1$). Note
that matrix coordinates start at (0,0) here, not (1,1). In fact, the system matrix has the
following structure (letters denote equal entries, excluding symmetry):


[Matrix display: block matrix with entries marked *, a, b, c, d, e, f, in which each letter marks a set of equal entries; the layout is not recoverable from this copy.]  (31)


C.  Comments 

1. To reconstruct the wavelet transform R(m,n) of r(x) from the wavelet transform
K(m,n) of k(x), simply do the following:

a. Compute $K''(\cdot)$ from k(x) using (22), (25b), and (29);

b. Solve the linear system of equations (28) for $H'(\cdot)$;

c. Compute R(m,n) from $H'(\cdot)$ using

$$R(m_1,n_1) = \sum_{m_2=0}^{M} H'(m_1,n_1,m_2,n_2 = 0)\,\phi(0^+)$$  (32)

which is derived by applying the wavelet transform (8a) to (24) while setting
t = 0 in the wavelet expansion (25a). The system of equations (28), along with
(29) and (32), thus relates the wavelet transforms K(m,n) and R(m,n);

2. The limits on the indices in (28) follow directly from the limits $0 \le x \le 1$, $0 \le t \le x$
in (23), and from the finite support of $\phi(x)$ on $0 \le x \le 1$. Equation (30) follows from
the same reasoning used in Section IV;


3. Unlike the algorithms of Sections III and IV, there are no fractional values of the $n_i$,
since there is no $\Delta$ parameter;

4. k(|z - t|) is under the integral in (23), and k(x - t) on the left side is only used for
x - t > 0. Since both k(t) and $\phi(x)$ are causal functions, we have k(|x|) = k(x) + k(-x),
and similarly for $\phi(x)$. As a result, (x - y) in (26) and (27) should be replaced by
|x - y|, so that $E(\cdot)$ is symmetric with respect to permutations of the indices {1,2}.
This means the system of equations (28) is symmetric, as is the integral equation (23);

5. If the Krein integral equation (23) is discretized with $\Delta = 2^{-M}$, then the resulting
system of equations has size $2^M$. If the finest scale used in (28) is M, then (31) shows
that the size of the system (28) is

$$1 + 2 + 4 + 8 + \cdots + 2^M = 2^{M+1} - 1$$  (33)

so (28) is only twice as large as a simple discretization of (23);

6. The kernel $K''(\cdot)$ in (28) has the displacement structure

$$K''(m_1, n_1 + 2^{m_1}\Delta, m_2, n_2 + 2^{m_2}\Delta) - K''(m_1,n_1,m_2,n_2) = 0;$$  (34)

this is clearly associated with the coupled wave systems (15), just as the Toeplitz
structure in the discretized Krein integral equation is associated with discrete transmission
lines [12]. The only difference is the differing wave speeds in different wave
systems. However;

7. The form of (31) makes it clear why the multichannel coupled lattice structure (15)
derived in Section IV requires coupling to not-yet-computed quantities at different
scales. The block-slanted-Toeplitz property imposes more constraints on blocks near
the main block-diagonal, while blocks far away from the main block-diagonal have
less structure. In short, although there is some structure in (31) (enough to associate
lattice equations (15) with it), there is not enough structure to allow the existence of
a fast algorithm!


VI.  CONCLUSION 

The wavelet transform has been used to generate a multiresolution analysis of the
exact (all multiple reflections included) one-dimensional inverse scattering problem. Three
new wavelet transform domain algorithms have been derived for this problem. The first
operates in the one-dimensional (time-only) wavelet transform domain. It consists of a
decoupled set of discrete wave equations, each with a different wave speed determined by
the wavelet transform resolution. Any one scale is sufficient to reconstruct the reflectivity
function; this permits some flexibility (e.g., using several scales for noise reduction) in
implementing this algorithm.

The second and third algorithms operate in the two-dimensional (time and space)
wavelet transform domain. One consists of a coupled multichannel set of discrete wave
equations; however, an approximation of neglecting some coupling between scales is required
to use it. An example using the Haar wavelet basis function suggests this is not
unreasonable. The other algorithm consists of solving a block-slanted-Toeplitz linear system
of equations, which specifies directly the relation between the wavelet transforms of
the reflectivity function and the impulse reflection response. The structure in this system
of equations explains the form of the coupled multichannel set of discrete wave equations
derived earlier. Finally, we note in passing that these results can be applied to the linear
prediction problem of computing the linear least-squares estimation filter of a zero-mean
wide-sense-stationary random process from its covariance function, since the Krein integral
equation has the same form as a Wiener-Hopf integral equation.
FIGURE HEADINGS

1.  A  differential  section  of  the  scattering  medium  to  be  reconstructed. 

2.  The  sampling  grid  (scale  vs.  translation)  for  the  wavelet  transform. 

3.  The  Haar  wavelet  basis  function. 


APPENDIX  D 

A.E.  Yagle,  “Inversion  of  Spherical  Means  Using  Geometric  Inversion  and 
Radon  Transforms,”  Inverse  Problems  8(6),  949-964,  December  1992. 

This  paper  analyzes  the  problem  of  reconstructing  a  function  from  its  spherical  means 
passing  through  the  origin.  A  new  application  of  this  problem  to  diffraction  tomography 
is  noted:  We  show  that  given  probing  by  impulsive  plane  waves  at  all  angles  of  incidence, 
only  a  single  receiving  sensor  is  necessary,  not  an  array  of  sensors. 

Two  versions  of  the  problem  are  defined.  A  layer-stripping-type  algorithm  is  derived 
for  one  version  (we  use  the  term  invariant  imbedding  in  the  paper,  since  it  is  more  familiar 
to  readers,  but  it  is  really  layer  stripping).  The  two  versions  are  shown  to  be  equivalent  to 
the  usual  and  exterior  inverse  Radon  transforms,  respectively,  using  geometric  inversion 
(reflection  about  a  circle).  A  simple  numerical  example  is  also  included. 
APPENDIX E1

J.  Frolik  and  A.E.  Yagle,  “Forward  and  Inverse  Scattering  for  Discrete 
Layered  Lossy  and  Absorbing  Media,”  submitted  to  SIAM  J.  Appl.  Math, 
June  1993. 

A  complete  theory  for  the  1-D  forward  and  inverse  scattering  problems  for  discrete 
layered  lossy  media  is  presented.  By  a  “complete  theory,”  we  mean  systems  of  equations 
that  are  discrete  counterparts  to  integral  equations,  and  discrete  fast  algorithms  that 
solve  these  systems  of  equations.  Applications  to  discrete  lossy  transmission  lines,  and 
to  electromagnetic  wave  propagation  in  absorbing  dielectrics,  are  made,  and  numerical 
examples  presented. 


FORWARD  AND  INVERSE  SCATTERING  FOR 
DISCRETE  LAYERED  LOSSY  AND  ABSORBING  MEDIA 

Jeffrey  L.  Frolik  and  Andrew  E.  Yagle 
Dept,  of  Electrical  Engineering  and  Computer  Science 
The  University  of  Michigan,  Ann  Arbor,  MI  48109-2122 

May  1993 

Abstract.  A  complete  digital  signal  processing  (DSP)  theory  is  developed  for  forward  and  inverse  scattering  in 
discrete  (piecewise-constant)  layered  lossy  systems,  generalizing  previous  work  for  discrete  lossless  systems  and 
continuous  lossy  systems.  This  work  is  motivated  by  radar  reflections  from  stratified  dielectrics  and  interchip 
communication  modeled  by  lossy  transmission  lines.  The  DSP  formulation  allows  exact  solutions,  without 
discretization  approximations,  and  includes  all  multiple  reflections,  transmission  scattering  losses,  and  absorption 
effects.  For  the  forward  problem,  discrete  matrix  Green's  functions  are  derived  for  the  lossy  medium,  as  are  fast 
algorithms  for  computing  impulse  reflection  and  transmission  responses  for  the  medium.  For  the  inverse  problem, 
asymmetric  Toeplitz  systems  of  equations  which  function  as  discrete  counterparts  to  integral  equations  are  derived,  as 
are  fast  algorithms  for  reconstructing  the  medium  from  its  impulse  transmission  and  reflection  responses.  Data 
sufficiency  and  feasibility  are  discussed.  Finally,  these  results  are  applied  to  the  LCRG  transmission  line  and  layered 
dielectric  medium  problems,  dispersion  effects  are  discussed,  and  a  numerical  example  is  presented. 

Keywords.  Inverse  scattering,  lossy  media,  layered  dielectrics,  lossy  transmission  lines. 


AMS (MOS) subject classification. 34A55, 35P25, 35R30, 39A12


1.  Introduction.  Inverse  scattering  problems  have  application  in  numerous  fields, 
including  mathematics,  geophysics,  physics,  and  electrical  engineering.  The  study  of  inverse 
scattering  for  electrical  engineers  is  motivated  by  the  mathematical  similarity  of  problems  such  as 
the  design  of  digital  filters  in  cascade  form,  the  derivation  of  fast  lattice-form  linear  least-squares 
prediction  error  filters,  synthesis  of  transmission  lines,  and  wave  propagation  in  layered  dielectric 
media,  etc.  In  each  case,  the  goal  is  to  compute  system  parameters  (e.g.,  PARCOR  coefficients, 
dielectric  constants,  wave  speeds,  etc.)  from  the  response  to  an  impulsive  excitation  measured  at 
the  system  boundaries.  This  response  is  known  as  the  scattering  data. 

Since  many  of  these  systems  are  discrete  in  nature  (e.g.,  digital  filters,  discrete  layered 
dielectric  media,  etc.),  they  can  be  modeled  as  discrete-time  systems.  The  discrete  version  of  the 
inverse  scattering  problem  has  been  applied  to  lossless  transmission  lines  [1],  Schrodinger's 
equation [2], elastic waves in layered media [3], and electromagnetic scattering [4]. However, all
of  these  were  lossless  inverse  scattering  problems. 

In  this  paper,  we  will  extend  previous  work  [1],  which  has  dealt  entirely  with  discrete 
lossless  systems,  to  the  case  of  discrete  lossy  or  absorbing  systems.  In  particular,  we  will  discuss 
and  link  the  following  topics:  (1)  scattering  (forward  problem);  (2)  reconstruction  (inverse 
problem);  (3)  discrete  matrix  Green's  functions;  (4)  discrete  asymmetric  Toeplitz  systems;  (5) 
asymmetric  Levinson  and  Schur  algorithms;  and  (6)  applications  to  lossy  transmission  lines  and 
electromagnetic  wave  propagation  in  layered  dielectrics. 

The  advantages  of  the  discrete  formulation  over  the  continuous  formulation  used  in  [5]  and 
[6]  are  three-fold.  First,  most  practical  problems  of  interest  are  in  fact  best  modeled  as  discrete 
(piecewise-constant),  rather  than  continuous,  systems.  The  present  work  was  motivated  by 
radioglaciology  (radar  propagation  through  glaciers  [7]);  other  applications  include  radar  reflections 
from  stratified  dielectrics  [8],  interchip  communication  modeled  by  lossy  transmission  lines  [9], 
and  design  of  optical  waveguide  gratings  [10].  Second,  our  digital  signal  processing  (DSP) 
formulation  explicitly  shows  the  connection  between  scattering  from  a  discrete  lossy  system  and 
using the Levinson and Schur algorithms for asymmetric Toeplitz systems of equations. This
generalizes previous work [1] connecting scattering from a lossless medium and the Levinson and
Schur  algorithms  for  symmetric  Toeplitz  systems  of  equations.  It  also  permits  the  use  of  such 
familiar  DSP  tools  as  the  z-transform  and  discrete  Fourier  transform,  which  can  be  computed 
easily  and  exactly,  rather  than  the  continuous  Fourier  transform,  which  must  always  be 
approximated  and  which  may  introduce  aliasing.  Finally,  important  issues  such  as  transmission 
losses  and  dispersion  appear  in  the  discrete  formulation  that  do  not  appear  in  the  continuous 
formulation  of  [5]  and  [6].  Since  continuous  equations  are  discretized  in  [5]  and  [6],  these  effects 
are  neglected  there.  This  makes  our  explicitly  discrete  approach  more  realistic,  both 
computationally  and  in  modeling. 

This  paper  is  organized  as  follows.  In  Section  2  we  define  the  asymmetric  two-component 
wave  system  that  models  wave  propagation  in  lossy  media  (as  will  be  shown  in  Sections  5  and  6). 
Section  3  defines  the  forward  scattering  problem,  various  types  of  scattering  data,  and  derives  the 
discrete  matrix  Green's  function  for  the  medium.  This  can  be  used  to  generate  scattering  data  from 
medium  parameters.  We  also  discuss  sufficiency  and  feasibility  of  scattering  data.  Section  4 
defines  the  inverse  scattering  problem,  and  shows  how  it  may  be  solved  either  by  solving  an 
asymmetric  Toeplitz  system  of  equations,  which  functions  as  a  discrete  counterpart  to  an  integral 
equation,  or  by  running  the  asymmetric  Schur  algorithm.  Section  5  shows  how  a  discrete  lossy 
(LCRG)  transmission  line  can  be  formulated  as  in  Section  2.  Section  5  also  discusses  the 
significance  of  dispersion,  relative  to  absorption.  Section  6  shows  how  electromagnetic  wave 
propagation  in  a  layered  dielectric  can  be  formulated  as  in  Section  2.  Although  there  have  been 
many  matrix  formulations  of  this  problem,  ours  is  the  first  that  can  be  linked  directly  to  asymmetric 
Toeplitz systems and already-existing fast algorithms. Section 7 presents an illustrative numerical
example demonstrating the reconstruction of a lossy layered dielectric. Section 8 concludes with a
summary.


2.  System  Definitions. 

2.1.  Discrete  Systems.  A  discrete  system  is  defined  to  consist  of  homogeneous 
segments  (or  layers),  in  each  of  which  the  characterizing  parameters  (e.g.,  impedance,  wave 
speed,  etc.)  do  not  change.  An  example  of  a  discrete  system,  a  transmission  line  in  which  x  is 
horizontal  position  and  Z  is  characteristic  impedance,  is  shown  in  Fig.  1. 


Fig. 1 Impedance profile for a discrete system (impedances $Z_1$, $Z_2$, $Z_3$ in segments bounded by $x_0 = 0$, $x_1$, $x_2$, $x_3$)
We assume that the travel time through a homogeneous layer is an integer multiple of a small time
increment $\Delta$. For example, if wave speed is constant in Fig. 1, then we assume $\Delta_n = n_n\Delta$ for
some integer $n_n$. Note that: (1) homogeneous layers of varying thicknesses are effectively being
modeled as stacks of thin, identical layers; (2) this situation can be realized by integrating and then
sampling continuous-time data; and (3) this assumption is also required in [1]. Since $\Delta$ is merely
a scaling factor, without loss of generality we set $\Delta = 1$ for convenience.


2.2.  Discrete  Asymmetric  Two-Component  Wave  Systems.  A  segment  of  a 
discrete  lossy  system  can  be  described  by  the  asymmetric  two-component  wave  system  shown  in 
Fig. 2.


Fig. 2 Transmission diagram for a discrete lossy system

$D_n(z)$ and $U_n(z)$ are the z-transforms of the Downgoing and Upgoing waves, respectively, and
are defined just to the right of the interface between the (n-1)st and nth segments. The
z-transform of a discrete-time sequence $\{\ldots, x_{-2}, x_{-1}, x_0, x_{+1}, x_{+2}, \ldots\}$ is denoted
$X(z) = Z\{x(n)\}$. The relationship between $\{D_n(z), U_n(z)\}$ and $\{D_{n+1}(z), U_{n+1}(z)\}$
is characterized by the layer propagation matrix $\{F_{n,n+1}(z)\}$


rD,„(z)l_  .  jD.jz.)]  1  r  1  0  p,(z)' 

t/„,(z).  C/.(z)  I  1  0  z”. ^U).’ 


which we write as the product of a z-independent transfer matrix having transfer
coefficients $\{r_{n,n+1}, s_{n,n+1}\}$ and a time-delay matrix $\{\Phi_{n,n+1}(z)\}$, so that


(2.2a) 


1  r 


(2.2b)  0_,(z)  = 


z""  0 

0  z"- 


The lossy model differs from the lossless model in that the transfer matrix coefficients
$\{r_{n,n+1}, s_{n,n+1}\}$ are not equal. Note that $\sqrt{1 - r_{n,n+1}s_{n,n+1}}$ is the transmission coefficient for
the layer, for reasons that will become clear in Section 3. The model (2.1) is motivated by the
applications discussed in Sections 5 and 6.
Fig. 3 Discrete multi-layer model


3.  Forward  Scattering. 

3.1. Transition Matrix. For the discrete multi-layer medium shown in Fig. 3, the
relationship between the waves in any two segments can be found by cascading the intervening
layer propagation matrices $\{F_{n,n+1}(z)\}$ defined in (2.1). Specifically, the waves $\{D_I = D_I(z),
U_I = U_I(z)\}$ in the last layer are related to those $\{D_0 = D_0(z), U_0 = U_0(z)\}$ in the first layer by

(3.1)  $$\begin{bmatrix} D_I(z) \\ U_I(z) \end{bmatrix} = \prod_{n} F_{n,n+1}(z) \begin{bmatrix} D_0(z) \\ U_0(z) \end{bmatrix} = M_I(z) \begin{bmatrix} D_0(z) \\ U_0(z) \end{bmatrix}$$

where the product of the intervening layer matrices $\{F_{n,n+1}(z)\}$ is the transition matrix $\{M_I(z)\}$
for the total medium, which is a 2x2 matrix of polynomials

(3.2)  $$M_I(z) = \begin{bmatrix} F_{11}(z) & F_{12}(z) \\ F_{21}(z) & F_{22}(z) \end{bmatrix}.$$

Note that $M_I(z)$ acts as the z-transform of a discrete Green's function for the entire medium, in
that it relates the waves at one end to the waves at the other end. As an aside, note that for lossless
media we have $r_{n,n+1} = s_{n,n+1}$ and the elements of the transition matrix can easily be shown to
satisfy the following relationship [1]:


3.2. Reflection and Transmission Responses. We define the forward reflection
$\{R^f(z)\}$ and transmission $\{T^f(z)\}$ responses as the responses of the layered medium to a
downgoing impulse at the top layer (see (3.4a)). Likewise, we define the backward reflection
$\{R^b(z)\}$ and transmission $\{T^b(z)\}$ responses as the responses to an upgoing impulse at the
bottom layer (see (3.4b)).

(3.4a)  $$\begin{bmatrix} T^f(z) \\ 0 \end{bmatrix} = M_I(z) \begin{bmatrix} 1 \\ R^f(z) \end{bmatrix}$$

(3.4b)  $$\begin{bmatrix} R^b(z) \\ 1 \end{bmatrix} = M_I(z) \begin{bmatrix} 0 \\ T^b(z) \end{bmatrix}$$

$R^f(z)$, $T^f(z)$, $R^b(z)$, and $T^b(z)$ are known collectively as the half-space impulse responses.
They are illustrated in Fig. 4 (note $1 = Z\{\delta(n)\}$).


Fig. 4 Half-space impulse responses (panels: Forward Response, Backward Response)


It can be shown that the forward and backward transmission responses are identical, i.e., we have
the reciprocity relation $T^f(z) = T^b(z)$. We also have

(3.5)  $$\mathrm{DET}\{M_I(z)\} = F_{11}(z)F_{22}(z) - F_{12}(z)F_{21}(z) = 1.$$


Note that (3.5) follows from the layer matrices $\{F_{n,n+1}(z)\}$ defined in (2.1) being unimodular, and
from the fact that the product of unimodular matrices is itself unimodular. Recall that a unimodular
matrix has determinant {DET} equal to unity, with entries that are functions of some variable, e.g., z.
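
The unimodularity argument can be checked numerically. The sketch below cascades layer matrices and evaluates the determinant of the result at one point on the unit circle. The explicit form of the layer matrix used here (a transfer matrix with off-diagonal entries -r and -s, normalized by the square root of 1 - rs, times a diagonal delay matrix) is an assumption modeled on the lossless lattice structure, not a transcription of (2.1); the function names and coefficient values are likewise illustrative.

```python
import numpy as np

def layer_matrix(r, s, z, delay=1):
    """Assumed layer matrix: normalized transfer matrix times a delay matrix."""
    tau = np.sqrt(1.0 - r * s)                     # assumed normalization, needs r*s < 1
    transfer = np.array([[1.0, -r], [-s, 1.0]]) / tau
    delay_matrix = np.array([[z ** (-delay), 0.0], [0.0, z ** delay]])
    return transfer @ delay_matrix

def transition_matrix(coeffs, z):
    """Cascade the layer matrices; later layers multiply on the left."""
    M = np.eye(2, dtype=complex)
    for r, s in coeffs:
        M = layer_matrix(r, s, z) @ M
    return M

# Hypothetical transfer coefficients (r, s) for three lossy layers.
coeffs = [(0.3, 0.2), (-0.1, -0.25), (0.4, 0.1)]
z = np.exp(1j * 0.7)                               # a point on the unit circle
M = transition_matrix(coeffs, z)
det = np.linalg.det(M)
```

Under these assumptions each factor has determinant one, so `det` equals one to within rounding error regardless of the coefficients chosen.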

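The outgoing wave components $\{D_I(z), U_0(z)\}$ can be written as a function of the
incoming wave components $\{D_0(z), U_I(z)\}$ using the scattering matrix $\{S(z)\}$

(3.6)  $$\begin{bmatrix} D_I(z) \\ U_0(z) \end{bmatrix} = \begin{bmatrix} T^f(z) & R^b(z) \\ R^f(z) & T^b(z) \end{bmatrix} \begin{bmatrix} D_0(z) \\ U_I(z) \end{bmatrix} = S(z) \begin{bmatrix} D_0(z) \\ U_I(z) \end{bmatrix}$$
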

and  Fig.  2,  which  depicted  the  wave  propagation  (2.1),  can  be  redrawn  as  the  scattering  diagram 
Fig.  5. 


Fig.  5  Scattering  diagram  for  discrete  lossy  system 


The  transition  matrix  formulation  has  the  advantage  that  the  wave  response  from  one  layer 
to  the  next  is  found  simply  by  multiplying  the  intervening  layer  propagation  matrix.  This  method 
does  not  hold  true  for  the  scattering  formulation;  however,  the  system  input/output  relationships 
are  best  described  using  the  scattering  formulation. 

The reflection and transmission responses $R^b(z)$ and $T^b(z)$ can easily be computed, as can
be seen by comparing Fig. 3 and Fig. 4. Given $\{r_{n,n+1}\}$ and $\{s_{n,n+1}\}$, compute $F_{n,n+1}(z)$ for each
n using (2.1), and $M_I(z)$ using (3.1). From (3.2) and (3.4b), it can be seen that

(3.7a)  $$R^b(z) = \frac{F_{12}(z)}{F_{22}(z)}$$   (3.7b)  $$T^b(z) = \frac{1}{F_{22}(z)}.$$

$R^f(z)$ and $T^f(z)$ can be computed similarly from the elements of $M_I^{-1}(z)$. Computing $M_I^{-1}(z)$ in
closed form, we have

(3.7c)  $$R^f(z) = -\frac{F_{21}(z)}{F_{22}(z)}$$

(3.7d)  $$T^f(z) = F_{11}(z) - \frac{F_{12}(z)F_{21}(z)}{F_{22}(z)}.$$

3.3. Adjoint System. In previous work [1], it was shown that when the system is
lossless, and therefore the transfer coefficients are equal $\{r_{n,n+1} = s_{n,n+1}\}$, the reflection
coefficients $\{r_{n,n+1}\}$ at each boundary can be found by measuring the one-sided impulse reflection
response $\{R^f(z)\}$ of the system. On the other hand, when the system is lossy, and the transfer
coefficients are not equal $\{r_{n,n+1} \ne s_{n,n+1}\}$, additional information is clearly needed.

The  additional  information  can  be  found  in  the  impulse  reflection  response  of  the  adjoint 
system  defined  in  (3.8)  below.  The  adjoint  system  does  not  physically  exist,  and  thus  its  impulse 
response  cannot  be  directly  measured.  However,  we  will  show  shortly  that  by  measuring  the  two- 
sided  impulse  response  of  the  system,  the  adjoint  impulse  reflection  response  can  be  calculated. 

We start by characterizing the adjoint system by a forward layer propagation matrix
$\{\tilde{F}_{n,n+1}(z)\}$, defined as


(3.8) 

/F,  ^ 

■  1 

z'”* 

0  ■ 

1 

0 

z"' 

and a transmission diagram (Fig. 6), in which the transfer matrix coefficients $\{r_{n,n+1}, s_{n,n+1}\}$ are
transposed from the actual system (see (2.1) and Fig. 2).


Fig.  6  Transmission  diagram  for  discrete  adjoint  system 


Note that for the lossless system $r_{n,n+1} = s_{n,n+1}$, in which case the actual and adjoint systems are
identical.

As one might suspect, there is a relationship between the scattering responses of the actual
and adjoint systems. We begin by using Fig. 5 to write the layer scattering matrix $\{S_{n,n+1}(z)\}$ for
the actual system:


(3.9a) 


(z)  = 


-s 


-2n. 


(3.9b) 


s;U(z)=z 


■►2n. 


+s 


■Zn. 
Next, take the transpose and replace z with $z^{-1}$, giving


(3.10) 


-r 


-2n. 


n+l^*,«+l  ^ 


Note that (3.10) is equivalent to interchanging the transfer coefficients $\{r_{n,n+1}, s_{n,n+1}\}$ in (3.9a), so
that $[S_{n,n+1}^{-1}(z^{-1})]^T$ is the scattering matrix for one layer of the adjoint system, as shown in Fig. 7.


Fig.  7  Scattering  diagram  for  adjoint  system 


Using a similar argument on (3.1) shows that the scattering matrix $\tilde{S}(z)$ for the entire adjoint
medium is related to the scattering matrix $S(z)$ for the entire actual medium by:

(3.11)  $$\tilde{S}(z) = [S^{-1}(z^{-1})]^T.$$

The continuous-medium counterpart to (3.11), with Fourier frequency $\omega$ instead of z, appeared in
[5]. However, (3.11) is much more useful, since the adjoint system responses can be computed
exactly using a simple inverse z-transform, or the discrete Fourier transform, instead of an inverse
continuous Fourier transform, which can never be computed exactly from real data.

The  reciprocity  relationship  still  holds,  and  we  have  the  following  identity: 

(3.12)  7^(z)  =  f'’(z)=>D£r{F(z-')}=f',,(z')f-,3(z')-F,3(z')f,j(z')  =  l. 


The transition matrix $\{\tilde{M}_I(z)\}$ for the adjoint system, which acts as the z-transform of the discrete
matrix Green's function of the adjoint medium, can now be found from the elements of the transition
matrix $\{M_I(z)\}$ of the actual system:


(3.13) 


(-1 


1=0 


Fu(^-') 


The  reflection  and  transmission  responses  of  the  adjoint  system  (defined  in  (3.11))  can  be 
computed  from  {  }  as  in  Section  3.3,  using  M,(z)  in  place  of  M,(z). 

3.4.  Data  Feasibility  and  Sufficiency. 

3.4A. Data Feasibility. We now derive another new relationship between the
responses of the actual system and those of the adjoint system. We now assume that the actual
and adjoint systems are both probed from just beneath free (perfectly reflecting) surfaces. This is
approximately the case if there is a huge impedance mismatch at the interface of the top two layers,
such as a ground-air interface or a conductor-dielectric interface. The free-surface system
equations for the actual and adjoint systems are


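(3.14a)  $$\begin{bmatrix} T(z) \\ 0 \end{bmatrix} = M_I(z) \begin{bmatrix} 1 + R(z) \\ R(z) \end{bmatrix}$$

(3.14b)  $$\begin{bmatrix} \tilde{T}(z) \\ 0 \end{bmatrix} = \tilde{M}_I(z) \begin{bmatrix} 1 + \tilde{R}(z) \\ \tilde{R}(z) \end{bmatrix}$$

where $R(z)$ and $\tilde{R}(z)$ are the free-surface impulse reflection responses of the actual and adjoint
media, respectively, and $D_0(z) = U_0(z) + 1$ states that the free surface reflects the upgoing wave to
become the downgoing wave ($1 = Z\{\delta(t)\}$ is the probing impulse).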

Equations  (3.5),  (3.12),  and  (3.14)  can  be  combined  to  show  that  the  cross-correlation  of 
the  actual  and  adjoint  medium  transmission  responses  is  the  superposition  of  the  actual  and  time- 
reversed  adjoint  system  free-surface  reflection  responses,  i.e., 

(3.15)  $$1 + R(z) + \tilde{R}(z^{-1}) = T(z)\,\tilde{T}(z^{-1}).$$


This  interesting  result  can  be  viewed  as  a  generalization  of  the  famous  Kunetz  equation  from 
lossless  to  lossy  systems.  Kunetz  [11]  showed  that  for  a  lossless  system,  the  free-surface 
reflection  response  is  one  side  of  the  autocorrelation  of  the  transmission  response,  i.e., 

(3.16)  $$1 + R(z) + R(z^{-1}) = T(z)\,T(z^{-1}).$$


The  relationship  between  (3.15)  and  (3.16)  can  be  easily  seen  by  recalling  that  for  a  lossless 
system  the  actual  and  adjoint  systems  are  identical.  Note  that  (3.16)  does  not  hold  for  lossy 
medium  responses. 

The Kunetz relation (3.16) shows that the two-sided free-surface reflection response (the
probing impulse, plus the reflection response, plus its time reversal) must be a positive definite
(pd) function, i.e., positive for |z| = 1 (z on the unit circle). This is an important check on the
feasibility of the reflection data: if the data are corrupted by noise such that $1 + R(z) + R(z^{-1})$ is no
longer pd, the data are infeasible, i.e., there is no medium that could produce this reflection
response. This is why layer stripping algorithms have the reputation of being unstable in noise;
they are being fed infeasible data! If the noisy data are altered to make $1 + R(z) + R(z^{-1})$ pd, the
algorithms are stable.
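
A minimal sketch of this feasibility check is given below. It samples the two-sided response on a uniform frequency grid and tests the sign of the minimum. The function names, the grid size, and the convention that the reflection sequence starts at lag one are assumptions made for illustration; a grid test of this kind can only detect violations at the sampled frequencies.

```python
import numpy as np

def kunetz_spectrum(refl, n_freq=1024):
    """Sample 1 + R(z) + R(1/z) on |z| = 1, where refl[n-1] is the lag-n reflection sample."""
    refl = np.asarray(refl, dtype=float)
    omega = 2.0 * np.pi * np.arange(n_freq) / n_freq
    lags = np.arange(1, len(refl) + 1)
    # For a real sequence, R(z) + R(1/z) = 2 * sum_n refl_n * cos(n * omega) on the unit circle.
    return 1.0 + 2.0 * np.cos(np.outer(omega, lags)) @ refl

def is_feasible(refl, n_freq=1024, tol=0.0):
    """True if the sampled two-sided response stays above tol at every grid frequency."""
    return bool(np.min(kunetz_spectrum(refl, n_freq)) > tol)

feasible = is_feasible([0.3, -0.1, 0.05])      # small reflections: spectrum stays positive
infeasible = is_feasible([0.9, 0.0, 0.0])      # 1 + 1.8*cos(omega) goes negative near omega = pi
```

The first sequence passes because its spectrum is bounded below by 1 - 0.6 - 0.2 - 0.1 = 0.1, while the second fails at frequencies near pi.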

We  now  consider  the  possible  use  of  (3.15)  to  perform  similar  feasibility  tests  on  data  from 
lossy  systems.  Eq.  (3.15)  can  easily  be  symmetrized  into 


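(3.17a)  $$1 + \tfrac{1}{2}\big[R(z) + R(z^{-1}) + \tilde{R}(z) + \tilde{R}(z^{-1})\big] = \tfrac{1}{2}\big[T(z)\tilde{T}(z^{-1}) + T(z^{-1})\tilde{T}(z)\big]$$

(3.17b)  $$= \tfrac{1}{2}\big[T(z)T(z^{-1}) + \tilde{T}(z)\tilde{T}(z^{-1}) - (T(z) - \tilde{T}(z))(T(z^{-1}) - \tilde{T}(z^{-1}))\big]$$

(3.17c)  $$= \tfrac{1}{4}\big[(T(z) + \tilde{T}(z))(T(z^{-1}) + \tilde{T}(z^{-1})) - (T(z) - \tilde{T}(z))(T(z^{-1}) - \tilde{T}(z^{-1}))\big]$$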


It  is  clear  that  each  term  on  the  right  side  is  pd,  so  the  symmetrized  left  side  (compare  to 
(3.16))  is  pd  if  the  difference  of  the  pd  terms  on  the  right  side  is  pd.  This  will  happen  if  T(z) 
and  T(z)  are  not  too  dissimilar.  Specifically,  if  for  all  z  on  the  unit  circle  the  cosine  of  the  phase 
difference  between  T(z)  and  T(z)  is  positive,  then  the  left  side  must  be  pd.  If  the  former 
condition  were  known  to  be  true  a  priori  (e.g.,  the  medium  has  low  losses),  the  latter  condition 
could  be  used  as  a  feasibility  check  on  the  data  R(z)  and  R(z). 
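
The feasibility test described above can be sketched numerically. The following is a minimal Python sketch, assuming the responses are real sequences r(1), r(2), ... with R(z) = sum_k r(k) z^{-k}; the function names and the frequency grid are illustrative assumptions, and sampling the unit circle on a finite grid only approximates a true positivity test.

```python
import numpy as np

def reflection_on_circle(r, omega):
    """R(e^{j omega}) for R(z) = sum_{k>=1} r[k-1] z^{-k} (hypothetical helper)."""
    k = np.arange(1, len(r) + 1)
    return np.exp(-1j * np.outer(omega, k)) @ np.asarray(r, dtype=float)

def symmetrized_spectrum(r, r_adj, num_freqs=1024):
    """Left side of (3.17a) sampled on the upper half of the unit circle.

    For real sequences R(1/z) is the complex conjugate of R(z) on |z| = 1,
    so R(z) + R(1/z) = 2 Re R(z).
    """
    omega = np.linspace(0.0, np.pi, num_freqs)
    return (1.0 + reflection_on_circle(r, omega).real
            + reflection_on_circle(r_adj, omega).real)

def is_feasible(r, r_adj=None, num_freqs=1024):
    """True if the sampled spectrum is strictly positive everywhere on the grid."""
    if r_adj is None:
        r_adj = r  # lossless case: actual and adjoint coincide, which gives (3.16)
    return bool(np.all(symmetrized_spectrum(r, r_adj, num_freqs) > 0.0))
```

For example, `is_feasible([0.4])` passes the lossless test, while `is_feasible([0.6])` fails it because 1 + 1.2 cos(omega) becomes negative near omega = pi.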

A  very  important  issue  that  arises  when  the  reflection  data  are  noisy  is  distinguishing 
reflections  off  actual  interfaces  from  noise  spikes  in  the  reflection  data  that  could  be  interpreted 
incorrectly  as  interfaces.  Several  authors,  e.g.,  [12],  have  proposed  thresholding  the  reflection 
data,  i.e.,  a  data  value  above  the  threshold  is  interpreted  as  an  actual  reflection,  while  a  data  value 
below  the  threshold  is  regarded  as  noise  and  set  to  zero.  This  approach  has  proven  quite 
promising for lossless media, and we propose its use for lossy media as well.

3.4B. Data Sufficiency. In the next section, we show that if the free-surface
responses {R(z), \tilde{R}(z)} of a discrete system are known, fast reconstruction is possible
using the Levinson or Schur algorithms. We now show that it is straightforward to relate the actual
and adjoint free-surface responses {R(z), \tilde{R}(z)} and the measurable half-space impulse
responses {R^F(z), R^B(z), T^F(z), and T^B(z)}. The significance of this result is that knowledge
of the four half-space actual system responses is sufficient to reconstruct the system, and
anything less is insufficient. This extends a result of [5] to discrete lossy systems.

The free-surface response {R(z)} for the actual system can be found simply from the half-space
forward reflection response {R^F(z)} by

(3.18)  R(z) = \frac{R^F(z)}{1 - R^F(z)},

which is easily understood in terms of feedback caused by the perfectly-reflecting free surface. A
relation analogous to (3.18) holds for the adjoint system as well. However, the adjoint half-space
reflection response is not known. Hence, determining the free-surface reflection response
{\tilde{R}(z)} of the adjoint system requires knowledge of all four of the actual system half-space
impulse responses. Using (3.11) gives


(3.19) 


R(z)  = 


RHz)  _ -/?'’(z-‘) _ 

1  -  (z)  T"  (z'  )T^  (z-' )  R^  (z‘  )(l  -  (z-' ))  ■ 


Therefore,  even  though  the  responses  of  the  adjoint  system  are  not  directly  measurable,  we  can 
calculate  these  responses  from  the  forward  and  backward  (i.e.,  two-sided)  half-space  impulse 
responses  of  the  actual  system. 


3.5. Discrete Layer-Removal Formulae. We now show that the free-surface
responses for the discrete lossy system satisfy a recursive relationship. We begin by rewriting
(2.1) as

(3.20)  \frac{U_{n+1}(z)}{D_{n+1}(z)}
          = \frac{z^{n_n} U_n(z) - r_{n,n+1} z^{-n_n} D_n(z)}{z^{-n_n} D_n(z) - s_{n,n+1} z^{n_n} U_n(z)},

and define the reflection response at the top of the n-th layer as R_n(z) = U_n(z) / D_n(z). This
definition results in the following recursion for a discrete lossy system:

(3.21)  R_{n+1}(z) = \frac{z^{2 n_n} R_n(z) - r_{n,n+1}}{1 - s_{n,n+1} z^{2 n_n} R_n(z)}.

The  recursion  is  initialized  at  the  surface  using  the  reflection  response  Rf{z).  Each  recursion 

removes  the  effect  of  one  layer  of  the  medium. 

The  lossless  version  of  this  equation  has  been  called  the  layer-removal  formula;  (3.21), 
which  is  new,  generalizes  the  formula  for  lossless  media  to  lossy  media.  Note  the  feedback  term 
in  the  denominator,  which  accounts  for  the  multiple  reflections. 

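A similar equation for the reflection response {\tilde{R}_n(z)} of the adjoint system may be found
by swapping the transfer matrix coefficients {r_{n,n+1}, s_{n,n+1}}, yielding

(3.22)  \tilde{R}_{n+1}(z) = \frac{z^{2 n_n} \tilde{R}_n(z) - s_{n,n+1}}{1 - r_{n,n+1} z^{2 n_n} \tilde{R}_n(z)},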


which  is  initialized  using  Rf{z). 

Since an impulse incident at the top of a discrete system requires a finite travel time to reach
deeper layers, we find that the transfer coefficients {r_{n,n+1}, s_{n,n+1}} may be obtained from
the reflection responses {R_n(z), \tilde{R}_n(z)}, respectively, using the initial value theorem:


(3.23a) 


(3.23b)

This constitutes a simple solution to the inverse scattering problem defined next. Note that (3.21)
and (3.22), run in decreasing n and initialized with zero, can also be used to solve the forward
scattering problem, instead of (2.1), (3.1), (3.2), and (3.7).
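
The layer-removal recursions can be exercised on truncated power series. The sketch below is illustrative only and rests on several assumptions that are not in the text: every layer has unit travel time (n_n = 1), responses are stored as coefficients of w = z^{-2}, the forward problem is started from zero below the last interface, and the names `series_div`, `forward_responses`, and `strip_layers` are hypothetical. It builds the actual and adjoint responses by running (3.21) and (3.22) in decreasing n, then strips the layers again, reading r_{n,n+1} and s_{n,n+1} off the leading coefficients.

```python
import numpy as np

def series_div(num, den):
    """Quotient num(w)/den(w) as a power series truncated to len(num) terms (den[0] != 0)."""
    n = len(num)
    q = np.zeros(n)
    for k in range(n):
        q[k] = (num[k] - np.dot(den[1:k + 1], q[:k][::-1])) / den[0]
    return q

def forward_responses(r, s, n_terms):
    """Run (3.21) and (3.22) in decreasing n, starting from zero below the last interface."""
    one = np.zeros(n_terms)
    one[0] = 1.0
    R = np.zeros(n_terms)
    Rt = np.zeros(n_terms)
    for rk, sk in zip(reversed(r), reversed(s)):
        S = series_div(R + rk * one, one + sk * R)     # z^2 R_n, from inverting (3.21)
        St = series_div(Rt + sk * one, one + rk * Rt)  # adjoint counterpart, from (3.22)
        R = np.concatenate(([0.0], S[:-1]))            # multiply by w = z^{-2}
        Rt = np.concatenate(([0.0], St[:-1]))
    return R, Rt

def strip_layers(R, Rt, n_layers):
    """Recover (r, s) interface by interface with (3.21) and (3.22)."""
    one = np.zeros(len(R))
    one[0] = 1.0
    r, s = [], []
    for _ in range(n_layers):
        S = np.concatenate((R[1:], [0.0]))             # z^2 R_n(z)
        St = np.concatenate((Rt[1:], [0.0]))
        rk, sk = S[0], St[0]                           # leading coefficients (initial values)
        r.append(rk)
        s.append(sk)
        R = series_div(S - rk * one, one - sk * S)     # (3.21)
        Rt = series_div(St - sk * one, one - rk * St)  # (3.22)
    return np.array(r), np.array(s)
```

Each stripping step invalidates one more trailing coefficient of the truncated series, so `n_terms` should comfortably exceed the number of layers to be removed.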


4.  Inverse  Scattering. 

4.1. Discrete Matrix Green's Function. Label the elements of the transition
matrices {M_r(z), \tilde{M}_r(z)} for portions of the actual and adjoint systems using polynomials
G(z), H(z), J(z), and K(z):


(4.1a) 


(4.1b) 


M,(z) 


M,(z) 


■^u(z) 

1 

■  z-"-‘G,.i(z) 

-z^^-7,.,(z-^)‘ 

/21(Z) 

1 

_ 1 

N 

+ 

1 _ 

>22(Z^) 

1 

-z*^->//,_,(z-')' 

_-z^’-‘/,_i(z) 

z^^-G,.,(z-')  _ 

where  the  cumulative  delay  {  T,_i }  and  transmission  factor  {  cr, }  are  defined  by 


(4.2a) 


r-l 


“r-l 


In.; 


(4.2b) 


1=0 


t-1  (-1 

^1-1  ~  n^>.‘+l  “  11 ~  *'i,i+l®i.i+l  • 


1=0 


1=0 
The matrices M_n(z) and \tilde{M}_n(z) are the discrete matrix Green's functions for the portion of
the medium above (or to the left of) the n-th interface. Note that M_n(z) = F_{n,n+1}(z) M_{n-1}(z)
and \tilde{M}_n(z) = \tilde{F}_{n,n+1}(z) \tilde{M}_{n-1}(z); therefore,


■z-^-G„(z)-z*^-/„(z-')‘ 

1  ^11,11+1 

z’”" 

0  ■ 

z‘^-‘G„_,(z)-z*^-'/„_i(z'') 

_z-^*//„(z)-z^^-7:„(z-^)_ 

1 

0 

z*^' 

z-^-'H„_,(z)-z^^-'K„_,iz^)_ 
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(4.3b) 


1 

^n,n+l 

-n. 

z 

0  ' 

z"'"-'/i:,_,(z)-z"^-'y/„_,(z'‘) 

_z^-y„(z)-z^^"G„(z')_ 

1 

_  0 

+  n_ 

z 

_z'"’-'  y„_,(z)-z"""-'G,.i(z'')_ 

Equations (4.3a) and (4.3b) are initialized by noting that for no interfaces {r = 0, T_{r-1} = 0}
we have M_{-1}(z) = \tilde{M}_{-1}(z) = I, so that G_{-1}(z) = K_{-1}(z) = 1 and
H_{-1}(z) = J_{-1}(z) = 0. Note for a lossless system G(z) = K(z), H(z) = J(z), and
M_r(z) = \tilde{M}_r(z).


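4.2. Derivation of Asymmetric Toeplitz System. We are again interested in the
waves generated in the actual and adjoint systems due to a downgoing impulse in a medium having
a free-surface upper boundary. Wave components at the interface of the (n-1)-th and n-th layers in
the actual system {D_n(z), U_n(z)} and adjoint system {\tilde{D}_n(z), \tilde{U}_n(z)} are related
to the actual {R(z)} and adjoint {\tilde{R}(z)} free-surface reflection responses using the
discrete matrix Green's functions (4.1):

(4.4a)  \begin{pmatrix} D_n(z) \\ U_n(z) \end{pmatrix} = \frac{1}{\sigma_{n-1}}
        \begin{pmatrix} z^{-T_{n-1}} G_{n-1}(z) & -z^{T_{n-1}} J_{n-1}(z^{-1}) \\
                        -z^{-T_{n-1}} H_{n-1}(z) & z^{T_{n-1}} K_{n-1}(z^{-1}) \end{pmatrix}
        \begin{pmatrix} 1 + R(z) \\ R(z) \end{pmatrix},

(4.4b)  \begin{pmatrix} \tilde{D}_n(z) \\ \tilde{U}_n(z) \end{pmatrix} = \frac{1}{\sigma_{n-1}}
        \begin{pmatrix} z^{-T_{n-1}} K_{n-1}(z) & -z^{T_{n-1}} H_{n-1}(z^{-1}) \\
                        -z^{-T_{n-1}} J_{n-1}(z) & z^{T_{n-1}} G_{n-1}(z^{-1}) \end{pmatrix}
        \begin{pmatrix} 1 + \tilde{R}(z) \\ \tilde{R}(z) \end{pmatrix}.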

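We solve for R(z) in the top equation of (4.4a) and R(z^{-1}) in the bottom. Similarly, we solve
for \tilde{R}(z) in the top equation of (4.4b) and \tilde{R}(z^{-1}) in the bottom. This yields

(4.5a)  R(z) = \frac{\sigma_{n-1} D_n(z) - z^{-T_{n-1}} G_{n-1}(z)}
                    {z^{-T_{n-1}} G_{n-1}(z) - z^{T_{n-1}} J_{n-1}(z^{-1})};

(4.5b)  R(z^{-1}) = \frac{\sigma_{n-1} U_n(z^{-1}) + z^{T_{n-1}} H_{n-1}(z^{-1})}
                         {z^{-T_{n-1}} K_{n-1}(z) - z^{T_{n-1}} H_{n-1}(z^{-1})};

(4.5c)  \tilde{R}(z) = \frac{\sigma_{n-1} \tilde{D}_n(z) - z^{-T_{n-1}} K_{n-1}(z)}
                            {z^{-T_{n-1}} K_{n-1}(z) - z^{T_{n-1}} H_{n-1}(z^{-1})};

(4.5d)  \tilde{R}(z^{-1}) = \frac{\sigma_{n-1} \tilde{U}_n(z^{-1}) + z^{T_{n-1}} J_{n-1}(z^{-1})}
                                 {z^{-T_{n-1}} G_{n-1}(z) - z^{T_{n-1}} J_{n-1}(z^{-1})}.

Adding (4.5a) to (4.5d) and (4.5b) to (4.5c) (replacing z -> z^{-1} in the latter) yields:

(4.6a)  (1 + R(z) + \tilde{R}(z^{-1})) (z^{-T_{n-1}} G_{n-1}(z) - z^{T_{n-1}} J_{n-1}(z^{-1}))
          = \sigma_{n-1} (D_n(z) + \tilde{U}_n(z^{-1}));

(4.6b)  (1 + R(z) + \tilde{R}(z^{-1})) (z^{T_{n-1}} K_{n-1}(z^{-1}) - z^{-T_{n-1}} H_{n-1}(z))
          = \sigma_{n-1} (\tilde{D}_n(z^{-1}) + U_n(z)).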


Note  that  causality  holds  for  both  the  actual  and  adjoint  systems;  therefore  the  waves  D^{z), 
U^(z),  D^{z),  and  U^iz)  are  all  zero  in  the  time  domain  until  the  probing  impulse  arrives.  At 
the  moment  of  arrival  {  T,_, }  of  the  probing  impulse,  the  downgoing  waves  {  (z),  D*(z) }  are 

both  <T^_j,  and  the  upgoing  waves  { C/*(z),  0^(z) )  are  both  zero;  n„  seconds  later  the  upgoing 
waves  are  ^  and  ,,  respectively.  Using  these  facts,  and  equating  powers  of  z  in 

(4.6),  we  can  write  (4.6)  in  the  time-domain  as 


(4.7a) 


1  r(l)  r{2)  ■  •  r(n)' 

’  8.(0)- j,(n)  k^(n)-hSO)  ' 

O' 

r(l)  1  r(l) 

g,(i)-;»(n-i) 

0  • 

r(2)  r(l)  1  • 

= 

•  •  r{l) 

^,(l)-/t,(n-l) 

•  0 

_r(n)  •  •  •  f(l)  1  _ 

.  5„(n)-y),(0)  k^(0)-h^{T\)  _ 

.0 

(4.7b) 


"-1." 


i=0 


'n-l 


(4.7c) 


^  r(n  -H  -  /)(g,(0  -  A(n  -  Q) 


1=0 


'<1-1 


where  n  =  (recall  we  are  letting  A  =  1  without  loss  of  generality),  U,(z)  =  ^g„(/)  z‘  and 

1=0 

similarly  for  H^(z),  y,(z),and  K^(z).  By  examining  the  propagation  of  coefficients  z°  and  z^° 
in (4.4), it can be shown that g_n(0) = k_n(0) = 1 and h_n(n) = j_n(n) = 0. Although it might seem
that many entries in (4.7a) are zero, since reflections can only reach the surface at integer linear
combinations of the T_i, in point of fact most entries are non-zero due to multiple reflections.

Eq. (4.7) shows the equivalence of the computation of the elements of the transition
matrices M_n(z) and \tilde{M}_n(z) (the z-transform of the discrete matrix Green's function) and
the solution of two asymmetric Toeplitz systems which act as discrete counterparts to integral
equations. In (4.7) the discrete-time free-surface reflection responses {r(n), \tilde{r}(n)} from
the actual and adjoint systems are assembled into an asymmetric Toeplitz matrix, which is solved to
recover the transfer coefficients {r_{n,n+1}, s_{n,n+1}} at each interface. A solution by Gaussian
elimination of the n x n system of equations Ax = b requires O(n^3) multiplications and
additions.


4.3. Solution using Asymmetric Levinson and Schur Algorithms. An n x n
Toeplitz system of equations can be solved recursively using the Levinson or Schur algorithms,
which require O(n^2) multiplications and additions. For ease of notation and familiarity, we make
the following substitutions:


• 

(4.8a) 

G  ->  Kn) 

(4.8b) 

K  ^ n-l.n 

(4.8c) 

K  =  L-i(l  -  kikl)  <7,  =  a,_,^l  -  r,_, 

• 

(4.8d) 

A,(z)  z'""G,(z)-z*""y,(z  ')  =>  a,,  g,(i)-y,(n  -1) 

(4.8e) 

S, (z)  ->  z* (z'‘ ) -  z'^” //„ (z)  =>  ->k^{n-i)-h^ (i) 

Asymmetric  Levinson  algorithm  [13] 


Step  1 .  Initialize:  n  =  0,  \{z)  =  Bq{z)  =  \ ,  and  tQ  =  r^. 


S,ep  2.  kl,  =  and  kl,  =  - 


1=0 
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Step  3. 


1 

]A  z' 

9 

/'■n.d+l 

z 

Step  4. 

Step  5.  Increment  n,  return  to  Step  2. 


Note  that  the  main  recursion  (Step  3)  is  identical  to  (4.3a),  after  the  substitutions  (4.8).  In  fact,  the 
Levinson  algorithm,  applied  to  this  problem,  is  simply  a  recursive  implementation  of  (4.3a).  For 
the  lossless  case,  we  note  that  (4.7)  is  symmetric,  and  thus  the  symmetric  Levinson  algorithm  may 
be  used. 
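
As a rough illustration of the O(n^2) recursion, the sketch below solves a general asymmetric Toeplitz system with a textbook nonsymmetric Levinson recursion and compares it with dense elimination. It is not a transcription of Steps 1-5 above: the function names `levinson_solve` and `toeplitz_dense`, the generic right-hand side `y`, and the assumption that all leading principal submatrices are nonsingular are introduced here for illustration. Following (4.7a), `row` would hold 1, r(1), ..., r(n) and `col` would hold 1, \tilde{r}(1), ..., \tilde{r}(n).

```python
import numpy as np

def toeplitz_dense(col, row):
    """Dense matrix with T[i, j] = col[i - j] for i >= j and row[j - i] otherwise."""
    n = len(col)
    return np.array([[col[i - j] if i >= j else row[j - i] for j in range(n)]
                     for i in range(n)], dtype=float)

def levinson_solve(col, row, y):
    """Solve T x = y for an asymmetric Toeplitz T in O(n^2) operations."""
    col, row, y = (np.asarray(a, dtype=float) for a in (col, row, y))
    f = np.array([1.0 / col[0]])      # forward vector:  T_m f = first unit vector
    b = f.copy()                      # backward vector: T_m b = last unit vector
    x = np.array([y[0] / col[0]])
    for m in range(2, len(y) + 1):
        ef = np.dot(col[m - 1:0:-1], f)   # last-row residual of [f, 0]
        eb = np.dot(row[1:m], b)          # first-row residual of [0, b]
        d = 1.0 - ef * eb
        f0 = np.append(f, 0.0)
        b0 = np.insert(b, 0, 0.0)
        f, b = (f0 - ef * b0) / d, (b0 - eb * f0) / d
        ex = np.dot(col[m - 1:0:-1], x)   # last-row residual of [x, 0]
        x = np.append(x, 0.0) + (y[m - 1] - ex) * b
    return x
```

When `row` equals `col` the two auxiliary vectors are mirror images of each other, which is the saving exploited by the symmetric Levinson algorithm in the lossless case.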

If we are not interested in recovering the components of the transition matrices, but only
interested in the transfer coefficients {r_{n,n+1}, s_{n,n+1}}, we may use the asymmetric Schur
algorithm:

Asymmetric Schur algorithm [13]


9 


9 


Step  1.  Initialize:  n  =  0,  Dg(z)  =  C/g(z)  =  R(z), 

Dg(z)  =  Ug(z)  =  R(z). 


1 

k}  z 

'u.(4 

z 

PM). 

1 

z^ 

'0.{z) 

lA 

z 

.DM). 

Step  3. 
Step  4. 


i}  = 


S.O 


K  =  —■ 


V-1 


S,o 


Increment n, return to Step 2.

As an aside, we note that our term "adjoint system" first appeared in [13], where it was
used entirely as a linear algebraic convenience.

5.  Discrete  Lossy  Transmission  Line. 

The  lossless  transmission  line  has  been  used  to  motivate  discussion  of  discrete  inverse 
problems  [1].  Inverse  scattering  for  a  discrete  lossy  transmission  line  is  a  natural  extension  of  this 
previous  work.  However,  this  problem  is  of  considerable  interest  in  its  own  right,  since  it  can  be 
used  to  model  interchip  communications  [9]. 

5.1. Basic Equations. A segment of a discrete lossy transmission line is shown in
Fig. 8. Over a finite interval {\Delta_n}, the distributed series resistance R_n, series inductance
L_n, shunt conductance G_n, and shunt capacitance C_n, all per unit length, are constant. We omit
the derivation of the telegrapher's equations and wave equations, since these may be found in [5]
and [9].


Fig. 8 Segment of a Discrete Lossy Transmission Line


The solution {v, i} to the transmission line wave equations anywhere on the line is a
combination of incident {v^i, i^i} and reflected {v^r, i^r} waves [1]. Consider the propagation of
the energy-normalized sinusoidal voltage wave components {V^i, V^r} having frequency \omega, where
the characteristic impedance {Z} in a segment of the lossy line is the complex quantity
Z = \sqrt{(R + j\omega L)/(G + j\omega C)}. The solution of the wave equations in terms of the
energy-normalized wave components {V^i, V^r} at a point z on the line is

(5.1a)  v = Z^{1/2} ( V^i e^{j\omega t} e^{-j\omega z/v} + V^r e^{j\omega t} e^{j\omega z/v} )

(5.1b)  i = Z^{-1/2} ( V^i e^{j\omega t} e^{-j\omega z/v} - V^r e^{j\omega t} e^{j\omega z/v} )

where the wave velocity v = [ (R/(j\omega) + L)(G/(j\omega) + C) ]^{-1/2} (note the complex
quantity v includes propagation, attenuation, and dispersion effects). Our goal is to find the
relationship between wave components {V^i_n, V^r_n} of one section and those
{V^i_{n+1}, V^r_{n+1}} of the next.


5.2. Formulation.

5.2A. Interface Effect. At an interface at position z = z_{n+1} = z_n + \Delta_n between two
sections having dissimilar impedances Z_n \neq Z_{n+1}, the voltages and currents must be
continuous by Kirchhoff's laws, i.e., v'_n = v_{n+1} and i'_n = i_{n+1}. Using (5.1) and equating
voltages and currents on the opposite sides of the interface, we have that the wave components
{V^i_{n+1}, V^r_{n+1}} just to the right of the interface at z = z_{n+1} are related to the
components {V^{i'}_n, V^{r'}_n} just to the left of the interface at z = z_{n+1} by


“ 

-/“..I  ‘ 

V‘  e 

Vo.! 

1 

■  1 

_r 

V'  e 

V..! 

Vi-r;,... 

_r 

H.H-hl 

1 

1 - 

where \Gamma_{n,n+1} is the reflection coefficient between the n-th and (n+1)-th layers, i.e.,
for the interface at z = z_{n+1}.

5.2B. Homogeneous Segment Effect. Since \Gamma \neq 0 only at an interface, within the
homogeneous transmission line segment {z \in [z_n, z_{n+1}]}, the wave components
{V^{i'}_n, V^{r'}_n} at the end of the section are related to those {V^i_n, V^r_n} at the
beginning by the propagation delay through the layer. The propagation constant {j\omega/v_n} can be
expressed as the complex sum of phase {\beta_n} and attenuation {\alpha_n} constants, i.e.,
j\omega/v_n = j\beta_n + \alpha_n. In matrix form, we have


- 

e 

0 

-jon^ 

V^e 

e  "  " 

0 

*g-AA, 

0 

-/COZ^ 

1 

O 

0 

0 

e 

1 

g 

> 

5.2C. Combined Effect. Combining (5.2) and (5.3) gives the relationship between the
wave components of one layer {V^i_n, V^r_n} and those in the next {V^i_{n+1}, V^r_{n+1}}:


V‘  e 

1 

g-a.A. 

-F 

■g->^,A, 

0 

- " 

V  p 

V'-F.., 

-F 

€ 

0 

g/^.A. 

jCiX^ 

y:e~  _ 

Since  it  has  zero  length,  the  interface  itself  is  lossless;  we  therefore  have 

_  Using  this  relationship,  we  introduce  the  phasor  wave 

components  {  }  as 


(5.5a) 

V‘  e 

'  B  +  l*- 

n 

—  y/*  ^  —  y/*  ^  yPn+l^w+l  ^  ^  i  =  0 

+n“,A, 

—  I/'-  —  V"'  +  _  V''  /J  '-o 

~  ^  t  t  —  »„+lC 

(5.5b) 

1/''  p 
^  rt+1^ 

-na.A, 

where  e  is  the  cumulative  loss  of  the  previous  layers.  Substituting  (5.5)  into  (5.4),  we  have 


our  desired  relationship: 


5.2D. Discrete Time Formulation. We now make either of the following two
assumptions:

(1) The transmission line is dispersionless, meaning that the Heaviside condition
R_n/L_n = G_n/C_n holds, i.e., the two time constants in Fig. 8 are equal; or

(2) The transmission line is a high-frequency line, so that j\omega L_n >> R_n and
j\omega C_n >> G_n. We discuss this assumption in more detail in Section 5.3.

Either assumption implies that the impedance {Z_n}, phase constant {\beta_n}, and attenuation
constant {\alpha_n} may be approximated by

(5.8a)  Z_n \approx \sqrt{L_n / C_n};

(5.8b)  \beta_n \approx \omega \sqrt{L_n C_n};

(5.8c)  \alpha_n \approx \frac{R_n}{2 Z_n} + \frac{G_n Z_n}{2}   or   \alpha_n = \sqrt{R_n G_n}
        (the second form holding for the dispersionless line).

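We now define the transfer coefficients {r_{n,n+1}, s_{n,n+1}} using the frequency-independent
attenuation coefficient {\alpha_n} in (5.8c) as

(5.9a)  r_{n,n+1} = \Gamma_{n,n+1} \exp( -2 \sum_{i=0}^{n} \alpha_i \Delta_i );

(5.9b)  s_{n,n+1} = \Gamma_{n,n+1} \exp( +2 \sum_{i=0}^{n} \alpha_i \Delta_i ).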


Note  for  a  lossless  section  of  transmission  line  a,  =  0,  so  that  Also  note  that  the 

transmission  loss  is  t, <1. 

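Since the time-delay matrix \Phi_{n,n+1}(z) can be expressed using the z-transform, and the
transfer matrix \Sigma_{n,n+1} is independent of \omega or z, we can combine them into the
discrete-time layer propagation matrix

(5.10)  F_{n,n+1}(z) = \frac{1}{\sqrt{1 - r_{n,n+1} s_{n,n+1}}}
        \begin{pmatrix} 1 & -s_{n,n+1} \\ -r_{n,n+1} & 1 \end{pmatrix}
        \begin{pmatrix} z^{-n_n} & 0 \\ 0 & z^{n_n} \end{pmatrix},

where r_{n,n+1} and s_{n,n+1} are defined in (5.9).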

Eq.  (5.10)  is  the  discrete-time  counterpart  of  the  continuous-time  asymmetric  two- 

component  wave  system  (122)  of  [5].  To  see  this,  assume  the  line  is  dispersionless  and  let  the 

time  increment  A  — >  0,  along  with  the  layer  thicknesses.  The  reflection  coefficient  T,  becomes 

m(z)  =  — ^ ((117)  of  [5])  and  and  become  (123)  of  [5].  The  latter  also 
2Z(z)  dz 

appeared  in  [14],  where  they  were  reconstructed  by  solving  a  coupled  pair  of  integral  equations 
which  are  the  continuous-time  counterparts  of  the  two  asymmetric  Toeplitz  systems  (4.7a). 


These continuous-time equations include dispersion effects, since they are derived directly
from the telegrapher's equations. However, the discretized versions of these continuous-time
equations, which in effect reconstruct a discrete layered medium such as considered in the present
paper, do not correctly model dispersion. Hence some analysis of the relative importance of
dispersion vs. absorption is required for numerical implementation of any of these equations.


5.3. Relative Effects of Dispersion and Absorption. The assumptions of (5.8)
made in Section 5.2D are tantamount to neglecting dispersion (frequency-dependent effects that
alter the shape of the probing pulse) as compared to absorption (frequency-independent effects that
attenuate the amplitude of, but do not alter the shape of, the probing pulse). We now show why
this is a reasonable assumption.

We consider the effect of propagation within a lossy homogeneous layer. From (5.1), this
is e^{\mp j\omega z/v}, where z is distance, the sign depends on upward versus downward
propagation, and v = [ (L + R/(j\omega))(C + G/(j\omega)) ]^{-1/2} is the complex wave speed.
Using the binomial expansion (1 + \epsilon)^{1/2} = 1 + \epsilon/2 - \epsilon^2/8 + O(\epsilon^3),
we have

(5.11a)  e^{j\omega z/v} = \exp\{ j\omega z \sqrt{LC} [ 1 + \frac{1}{j\omega}(\frac{R}{L} + \frac{G}{C})
                           + \frac{1}{(j\omega)^2} \frac{RG}{LC} ]^{1/2} \}

(5.11b)  = \exp\{ j\omega z \sqrt{LC} [ 1 + \frac{1}{2j\omega}(\frac{R}{L} + \frac{G}{C})
           + \frac{1}{2(j\omega)^2} \frac{RG}{LC}
           - \frac{1}{8(j\omega)^2}(\frac{R}{L} + \frac{G}{C})^2 + O(\frac{1}{\omega^3}) ] \}

(5.11c)  = \exp\{ j\omega z \sqrt{LC} [ 1 + \frac{1}{2j\omega}(\frac{R}{L} + \frac{G}{C})
           - \frac{1}{8(j\omega)^2}(\frac{R}{L} - \frac{G}{C})^2 + O(\frac{1}{\omega^3}) ] \}

(5.11d)  = \exp\{ j\omega z \sqrt{LC} [ 1 + \frac{a}{j\omega} - \frac{b^2}{2(j\omega)^2}
           + O(\frac{1}{\omega^3}) ] \},

where a = \frac{1}{2}(\frac{R}{L} + \frac{G}{C}) and b = \frac{1}{2}(\frac{R}{L} - \frac{G}{C}).

Analysis of the final bracketed expansion shows the relative significance of various
propagation effects. The zeroth-order term {1} multiplying j\omega z \sqrt{LC} is an eikonal term,
which simply expresses the fact that the wave propagates at speed 1/\sqrt{LC}. The first-order
term {a/(j\omega)}, after multiplication by j\omega z \sqrt{LC}, simply represents the absorption
(note \alpha = a \sqrt{LC} for \alpha defined in (5.8)). The second-order term
{-b^2/(2(j\omega)^2)} represents the most significant effects of dispersion (note the condition
for a dispersionless line is precisely b = 0).

The significance of this expansion is that dispersion is a higher-order asymptotic (in
1/\omega) effect than absorption. Comparing this expansion to low-frequency asymptotic expansions
in optics, we see that for high-frequency pulses, dispersion will be negligible compared to
absorption and interface reflection. This is not apparent from the continuous-time development in
[5], so that the discrete formulation again provides new insights.
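
The ordering of these terms can be checked numerically. The sketch below compares the exact bracketed factor sqrt((1 + R/(j omega L))(1 + G/(j omega C))) with its truncated expansion 1 + a/(j omega) - b^2/(2 (j omega)^2); the function names and the ratios R/L = 2, G/C = 1 are illustrative assumptions. The truncation error should fall roughly as 1/omega^3, and the first-order form should be exact when b = 0.

```python
import numpy as np

def exact_factor(R_over_L, G_over_C, omega):
    """Exact bracketed factor multiplying j*omega*z*sqrt(LC) in (5.11a)."""
    jw = 1j * omega
    return np.sqrt((1 + R_over_L / jw) * (1 + G_over_C / jw))

def expansion(R_over_L, G_over_C, omega, order=2):
    """Truncated expansion: eikonal (order 0), absorption (1), dispersion (2)."""
    jw = 1j * omega
    a = 0.5 * (R_over_L + G_over_C)
    b = 0.5 * (R_over_L - G_over_C)
    terms = [1.0, a / jw, -b**2 / (2 * jw**2)]
    return sum(terms[:order + 1])

for omega in (1e2, 1e3):
    err1 = abs(exact_factor(2.0, 1.0, omega) - expansion(2.0, 1.0, omega, order=1))
    err2 = abs(exact_factor(2.0, 1.0, omega) - expansion(2.0, 1.0, omega, order=2))
    print(f"omega={omega:g}: error without dispersion term {err1:.2e}, with it {err2:.2e}")
```

The error of the first-order form is the size of the dispersion term itself, so printing both shows how quickly dispersion becomes negligible relative to absorption as omega grows.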

A similar analysis can be carried out for attenuation and dispersion caused by reflection and
transmission of waves at interfaces. We have

(5.12)  Z = \sqrt{\frac{R + j\omega L}{G + j\omega C}}
          = \sqrt{\frac{L}{C}} [ 1 + \frac{1}{2j\omega}(\frac{R}{L} - \frac{G}{C}) + O(\frac{1}{\omega^2}) ]
          = \sqrt{\frac{L}{C}} [ 1 + \frac{b}{j\omega} + O(\frac{1}{\omega^2}) ].

This shows that dispersion will again be a low-frequency asymptotic correction to the
frequency-independent attenuation caused by Z = \sqrt{L/C} in
\Gamma_{n,n+1} = (Z_{n+1} - Z_n)/(Z_{n+1} + Z_n). Note that for a dispersionless line (b = 0)
this correction is again zero.

Note that discretized versions of the continuous-time equations in [5] and [6] include
dispersion as a continuous scattering phenomenon, modeled by the reflectivity function b(z) in
[5]. Medium inhomogeneities are modeled by the reflectivity function m(z) defined above; b(z)
and m(z) are weighted equally. While this is correct in continuous time, the above analysis shows
that for discretized equations, corresponding to a discretized medium, these effects should not be
treated as having equal importance. In short, discretized continuous equations give the wrong
answer; an explicitly discrete formulation is called for.

6. Electromagnetic Waves in Layered Dielectrics. We now extend the
transmission line formulation to the case of electromagnetic plane waves propagating in a lossy
layered dielectric. Instead of segments of transmission line with uniform impedances {Z_n}, we
now have layers of dielectrics having infinite lateral extent and uniform dielectric constants
{\epsilon_n}. In reading this section, parallels to Section 5 should be noted throughout. The
forward and inverse scattering problems here have applications in radioglaciology [7] and radar
reflections off of stratified dielectrics [8]. We have applied these results to radioglaciology
in [15].

6.1. Basic Equations. The general form of a time-harmonic propagating
electromagnetic wave consists of orthogonal electric {\mathbf{E} = \mathbf{E} e^{j(\omega t - \mathbf{k}\cdot\mathbf{r})}}
and magnetic {\mathbf{H} = \mathbf{H} e^{j(\omega t - \mathbf{k}\cdot\mathbf{r})}} field vectors
having amplitudes {E = |\mathbf{E}|} and {H = |\mathbf{H}|}, respectively, and frequency {\omega}.
The propagating waves {\mathbf{E}, \mathbf{H}} each have a temporal {\omega t} and spatial
{\mathbf{k}\cdot\mathbf{r}} component, where \mathbf{k} = k_x \hat{x} + k_y \hat{y} + k_z \hat{z}
is the propagation vector and \mathbf{r} = x \hat{x} + y \hat{y} + z \hat{z} is the displacement
vector. The magnitude of the propagation vector is the wavenumber or propagation constant and is
given by k = |\mathbf{k}| = \omega \sqrt{\mu \epsilon}. By the consistency condition,
k = \sqrt{k_x^2 + k_y^2 + k_z^2}.

In general, the dielectric constant {\epsilon} is complex; it may be expressed as
\epsilon = \epsilon' - j\epsilon'', where \epsilon' = permittivity and \epsilon'' = loss factor.
Alternatively, we may write \epsilon = \epsilon' - j\sigma/\omega, where \sigma = conductivity.
Both lossless (real \epsilon) and lossy (complex \epsilon) dielectrics are considered. We make
the common assumptions that the dielectric is simple (i.e., stationary, linear, and isotropic) and
non-magnetic (i.e., \mu = \mu_0 = permeability of free space).

An  infinite-extent  planar  TE  wave  is  now  considered.  The  plane  wave  is  incident  on  the 
layered  dielectric  at  an  angle  6^  relative  to  the  direction  z,  which  is  normal  to  all  dielectric 


boundaries  (see  Fig.  9).  From  the  TE  condition,  the  electric  field  is  completely  perpendicular  to 
the  plane  of  incidence;  this  is  also  known  as  being  perpendicularly  or  horizontally  polarized. 


The  TE  electric  field  is  given  by 
equation  for  a  TE  wave  is  thus: 


=  y£  =  yE  e  The  homogeneous  wave 


,21 

- 1 - u 

dx^  dz^  ) 


V 


£^  =  0,  obtained  by  substituting  into 


V^E  +  ;t'E  =  0 

The  solution  for  (the  field  in  the  n'^  dielectric  layer  )  has  two  wave  components,  one  in 
the  +z  direction  {  A. }  and  the  other  in  the  -z  direction  { C, ).  We  energy-normalize  these 
components  to  D,  and  (7,  such  that  A,  =  D,(^/i7cos0,)  ^  and  C^  =  U^{pJe^ cos 6^^ 
where  6^  is  the  angle  between  the  direction  of  travel  of  the  plane  wave  and  the  +z  direction,  in  the 
n‘^  layer.  Applying  Maxwell's  equations  to  the  solution  for  to  obtain  //„,  the  tangential 


propagating  fields  are  found  to  be 


(6.1a) 


(6.1b) 


H,.  = 


(yfT,  cos 

_-(Je^cose^ 
Vo 


)e''^ 


where \eta_0 = \sqrt{\mu_0/\epsilon_0} = 377 \Omega is the intrinsic impedance of free space, and
the propagation constant {k_n} for a layer is related to its components {k_{zn}, k_x} by
k_{zn} = k_n \cos\theta_n and (using Snell's law)
k_x = k_n \sin\theta_n = k_0 \sin\theta_0 = \omega \sqrt{\mu_0 \epsilon_0} \sin\theta_0. The
physical relationship between these components and the dielectric boundaries is shown in Fig. 10.


Fig.  10  Wave  definitions  in  layered  dielectrics 


The goal at present is to find the relationship between the fields {D_n, U_n} in one dielectric
layer and those {D_{n+1}, U_{n+1}} in the next.


6.2.  Formulation. 

6.2A. Interface Effect. From Faraday's and Gauss's laws, at the interface z = d_{n+1},
we have continuity of tangential electric and magnetic fields, i.e., E'_{yn} = E_{y(n+1)} and
H'_{xn} = H_{x(n+1)}. Thus the field components {D_{n+1}, U_{n+1}} just below the boundary are
related to the components {D'_n, U'_n} just above the boundary by


(6.2a) 


1 

■  1 

i/i 

1 

(6.2b)  R_{n,n+1} = \frac{\sqrt{\epsilon_n} \cos\theta_n - \sqrt{\epsilon_{n+1}} \cos\theta_{n+1}}
                         {\sqrt{\epsilon_n} \cos\theta_n + \sqrt{\epsilon_{n+1}} \cos\theta_{n+1}}.

At any interface of dissimilar dielectrics, both components of a propagating electromagnetic wave
encounter impedance mismatches. At the interface a portion of the incident wave will be reflected,
while the remainder will be transmitted. The portion reflected is determined by the TE Fresnel
reflection coefficient {R_{n,n+1}}.


6.2B.  Homogeneous  Layer  Effect.  Within  a  homogeneous  dielectric  layer,  the 
reflection  coefficient  is  zero,  so  that  the  relation  between  the  waves  {  D',  t/' }  at  the  bottom  of  the 
layer  and  those  {  D,,  t/, }  at  the  top  of  the  n**  layer  is  as  follows.  Let  A,  =  -  d,  be  the 

thickness  of  the  layer  and  jk^  be  the  propagation  constant:  jk^  =  “  -/A  +  is  the 

complex  sum  of  phase  { /?,  =  attenuation  {  a,  =  )  constants.  Then 


(6.3) 


~  -a.A.secS, 

e 

0 

0 

0 

^+a.A.tece. 

0 

6.2C.  Combined  Effect.  From  phase  continuity  of  tangential  components  at  a 

■  ■ 

-J^a,A;sec0|  -J^a,A,iece, 

boundary,  we  have  -o  ^  where  e  is 

the  cumulative  loss  factor  [LJ.  Defining  the  phasor  wave  components  D„  =  D^e~^^'‘‘-  and 
U,  =  ,  we  again  fmd  the  relationship  between  phasor  components  { D, ,  U, }  and  {  , 

U,+i }  given  in  terms  of  a  frequency-dependent  layer  propagation  matrix  {  F,  ) : 


(6.4a) 


«+l 


=  F,.,,,(ty) 


D- 

L. 


=  2:,,,,,(cy)<D„^.(£o) 


D, 

U. 


with  transfer  matrix 
(6.4b) 

and  a  time -delay  matrix 


-R. 


1 

m 

”2^0[^A„iec 
(6.4c)

6.2D. Discrete-time Formulation. If we make the assumptions that the dielectrics
are materials low in loss {\omega\epsilon' >> \sigma} and relatively independent of frequency
{\epsilon'(\omega) = \epsilon'}, the following approximation can be made:


(6.5)

We again introduce the frequency-independent transfer matrix coefficients
{r_{n,n+1}, s_{n,n+1}} as


(6.6a)

(6.6b)

Note that both the phase constant {\beta_n} and traveled distance {\Delta_n \sec\theta_n} in a
layer may be written in terms of the phase velocity {v_n} for the layer, i.e.,
\beta_n = \omega / v_n and \Delta_n \sec\theta_n = v_n n_n \Delta [16]. We again assume that
travel-time through each layer is an integer {n_n} multiple of a small time-increment {\Delta}.
Then the forward layer propagation matrix can again be expressed using the z-transform as


(6.7)

The asymptotic analysis of Section 5.3 can be applied to electromagnetic wave propagation
as well, and will not be repeated here.

7. Numerical Example. We provide a simple, illustrative numerical example. The
goal is to reconstruct the unknown lossy three-layer dielectric in free space shown in Fig. 11.

    \sigma -> \infty
    \epsilon = 1
    \epsilon = 4 - .04i
    \epsilon = 1 - .01i
    \epsilon = 9 - .09i
    \epsilon = 1

Fig. 11 Layered dielectric model

A video pulse system operating at 10 GHz is simulated. The values of permittivity {\epsilon'} and
thickness {\Delta_n} were chosen so that all returns from normal probing occur at an integer
multiple of .05 nsec (of course, a smaller value could also be used). The loss factors
{\epsilon''} were chosen so that the layers would substantially attenuate the signals, but small
enough such that the low-loss assumption of Section 6.2D is valid. The time-line bounce diagram
for normal probing of this model is shown in Fig. 12.


Fig. 12 Bounce diagram for normal probing


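We calculate the attenuation coefficient for each layer as $\alpha_n = k_0 \left| \operatorname{Im} \sqrt{\varepsilon_n} \right|$,
where $\varepsilon_n = \varepsilon'_n - i\varepsilon''_n$, $k_0 = 2\pi f / c$, $f = 10$ GHz and $c = 3 \cdot 10^8$ m/s (the speed of light in free space). The layer
parameters are given in Table 1.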
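
As a check on the attenuation formula, the following minimal sketch recomputes the attenuation coefficients of the three lossy layers. It assumes $k_0 = 2\pi f / c$ and assumes that the last column of Table 1 is the two-way layer loss $e^{-2\alpha_n\Delta_n}$; both are inferences that happen to reproduce the tabulated numbers, not statements taken from the text. The function names are illustrative only.

```python
import cmath
import math

C = 3.0e8    # speed of light in free space (m/s), value used in the text
F = 10.0e9   # operating frequency (Hz), value used in the text


def attenuation(eps_real, eps_loss, f=F, c=C):
    """alpha = k0 * |Im sqrt(eps' - i eps'')|, with k0 = 2*pi*f/c (assumed)."""
    k0 = 2.0 * math.pi * f / c
    return k0 * abs(cmath.sqrt(complex(eps_real, -eps_loss)).imag)


def two_way_loss(alpha, thickness):
    """Amplitude loss for one round trip through a layer (assumed meaning
    of the last column of Table 1)."""
    return math.exp(-2.0 * alpha * thickness)


# (eps', eps'', thickness in m) for the three lossy layers of the model
layers = [(4.0, 0.04, 0.12), (1.0, 0.01, 0.12), (9.0, 0.09, 0.06)]

results = []
for eps_real, eps_loss, thickness in layers:
    alpha = attenuation(eps_real, eps_loss)
    results.append((alpha, two_way_loss(alpha, thickness)))

for alpha, loss in results:
    print(f"alpha = {alpha:.2f}, two-way loss = {loss:.3f}")
```

Under these assumptions the computed values agree with the tabulated attenuation and loss entries to the number of digits shown in Table 1.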


Table 1  Lossy model parameters

Layer (n)   ε'    Δ_n (m)   τ_n (nsec)   ε''    α_n     e^(-2 α_n Δ_n)
(0)         1     .06       .40          0      0       1.00
(1)         4     .12       1.60         .04    2.09    .605
(2)         1     .12       .80          .01    1.05    .778
(3)         9     .06       1.20         .09    3.14    .686
(4)         1     ∞         ∞            0      0       1.00
The reflection coefficients {$R_{n,n+1}$} (calculated using (6.4b)), loss factor {$L_n$} for each interface
(the cumulative loss of the preceding layers, as given in Section 6.3C), and transfer coefficients
{$r_{n,n+1}$, $s_{n,n+1}$} for each interface of the lossy model (computed using (6.6)) are all shown in
Table 2.


Table 2  Lossy model transfer coefficients

Interface (n, n+1)   R_{n,n+1}   L_n     r_{n,n+1}   s_{n,n+1}
0, 1                 -.333       1.00    -.333       -.333
1, 2                 +.333       .605    +.202       +.551
2, 3                 -.500       .471    -.235       -1.063
3, 4                 +.500       .323    +.161       +1.549
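
The entries of Table 2 can be reproduced from Table 1 with the short sketch below. It assumes normal incidence, uses only the real part of each permittivity in the reflection coefficient, and takes the transfer coefficients to be the product and quotient of the reflection coefficient and the cumulative loss; these choices are inferred from the tabulated numbers, and the function name is hypothetical.

```python
# sqrt(eps') for layers 0..4 and two-way loss of layers 0..3, from Table 1
sqrt_eps = [1.0, 2.0, 1.0, 3.0, 1.0]
layer_loss = [1.0, 0.605, 0.778, 0.686]


def transfer_coefficients(sqrt_eps, layer_loss):
    """Return (R, L, r, s) for each interface (n, n+1).

    R: normal-incidence reflection coefficient (real permittivities assumed)
    L: cumulative two-way loss of all layers above the interface
    r = R * L (actual system), s = R / L (adjoint system)
    """
    rows = []
    cumulative = 1.0
    for n in range(len(sqrt_eps) - 1):
        cumulative *= layer_loss[n]
        R = (sqrt_eps[n] - sqrt_eps[n + 1]) / (sqrt_eps[n] + sqrt_eps[n + 1])
        rows.append((R, cumulative, R * cumulative, R / cumulative))
    return rows


table2 = transfer_coefficients(sqrt_eps, layer_loss)
for n, (R, L, r, s) in enumerate(table2):
    print(f"{n},{n + 1}: R={R:+.3f} L={L:.3f} r={r:+.3f} s={s:+.3f}")
```

Because the inputs are the rounded Table 1 values, the last digit of a computed entry may differ slightly from the tabulated one.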
The actual {$r(n)$} and adjoint {$\tilde r(n)$} system free-surface reflection responses for the
lossy model, computed using (3.7), are shown in Fig. 13a and Fig. 13b, respectively.


Fig. 13b  Adjoint system reflection response vs. time (nsec)


The sampled responses $r(n)$ and $\tilde r(n)$ are assembled into an asymmetric Toeplitz system,
which is solved using the asymmetric Levinson algorithm. This produces two sets of transfer
coefficients as functions of elapsed time: one for the actual system {$r$}; and one for the adjoint
system {$s$}. The transfer coefficients {$r_{n,n+1}$, $s_{n,n+1}$} for the lossy model are shown in Fig. 14a
and Fig. 14b, respectively. Comparing Fig. 13 and Fig. 14 shows the significance of multiple
reflections.


From these values we can obtain $R_{n,n+1}$ and $L_n$. Hence, the lossy medium's parameters
{$\varepsilon_n$, $\Delta_n$, and $\alpha_n$ ∀ n} can be completely reconstructed using (6.26), (6.5), and (6.6). The
reconstructed parameters are identical to the actual parameters in Table 1, and are not repeated here.


Fig. 14b  Transfer coefficients {$s$} for adjoint system vs. time (nsec)


8.  Conclusion.  We  have  extended  the  development  of  a  DSP  theory  for  forward  and 
inverse  scattering  from  discrete  lossless  systems  to  discrete  lossy  systems.  Discrete  lossy  systems 
are  in  fact  more  useful  for  modeling  many  real-world  scattering  and  inverse  scattering  problems. 
The  advantage  of  the  DSP  formulation  is  that  the  signal  processing  can  be  carried  out  exactly, 
whereas  continuous-time  problems  must  be  discretized,  and  effects  such  as  transmission  losses 
and  dispersion  do  not  appear.  The  development  of  inverse  scattering  for  discrete  lossy  systems 
was  presented.  Major  contributions  of  this  paper  include:  (1)  a  discrete-time  formulation  of 
scattering  for  lossy  systems;  (2)  an  explicit  derivation  of  the  asymmetric  Toeplitz  system  for 
inverse  scattering  for  a  discrete  lossy  system;  (3)  explicit  formulations  of  the  discrete  lossy 


transmission  line  and  layered  dielectric  medium  problems  as  discrete  asymmetric  two-component 
wave  systems;  (4)  discussions  of  the  significance  of  data  feasibility  for  lossy  inverse  scattering 
problems;  and  (5)  an  asymptotic  frequency  analysis  of  the  lossy  transmission  line,  and  a 
discussion  of  the  relative  significance  of  absorption  and  dispersion  effects.  The  latter  analysis 
demonstrates  that  for  real-world  problems  with  high-frequency  pulses,  our  DSP  formulation 
models  the  significant  features  of  the  problems. 
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APPENDIX  E2 

J. Frolik and A.E. Yagle, “Reconstruction of Multi-Layered Lossy Dielectrics
from Plane Wave Impulse Responses at Two Angles of Incidence,”
submitted to IEEE Trans. Geosci. and Rem. Sensing, June 1993.

The problem posed in Appendix E1 is solved using plane wave reflection responses
at two angles of incidence, rather than reflection and transmission data (the latter would
not be available in remote sensing applications). This includes a novel semi-iterative use
of layer stripping. Numerical examples on reconstructing a glacial ice shelf from radar
reflections demonstrate the significance of modelling multiple reflections and losses.
Abstract: Motivated by the radioglaciology inverse problem,
we present new algorithms for reconstructing a lossy, stratified
dielectric from its impulsive plane wave reflection responses
at two different angles of incidence. Novel features of these
algorithms include: 1) a digital signal processing formulation
that does not require discretization of continuous equations; 2)
use of the asymmetric Levinson algorithm for rapid solution
of the forward and inverse problems; 3) a novel use of layer
stripping ideas, featuring iteration between the forward and
inverse problems, with each iteration recursively determining
another layer of the medium; and 4) another recursive algorithm
for determining the bottom lossy half-space from reflection data
only. Numerical examples illustrate the new algorithms on the
reconstruction of a synthetic but realistic layered ice shelf from
both noiseless and noisy radar reflection data.


I.  Introduction 

The electromagnetic plane wave inverse scattering problem
is to compute the electrical parameters (e.g., permittivity,
attenuation constant) of a layered dielectric from
scattering data (e.g., reflection response to an impulsive plane
wave). Previous methods can be classified into two categories:
1) those based on integral-inverse algorithms (e.g., Gel’fand-
Levitan-Marchenko and parameter optimization techniques);
and 2) those based on differential-inverse algorithms (e.g.,
layer stripping techniques). Comprehensive reviews of previous
work can be found in [1], [2] and their associated
references.

A few methods applicable to the case of lossy dielectrics are
as follows. Iterative methods for simultaneously reconstructing
permittivity and conductivity profiles using the distorted Born
approximation have been applied to a variety of geometries
[20]. However, this method is suitable only for smoothly
varying media and would not be appropriate for investigating
stratified media having dissimilar layers. Time domain
methods have been used to reconstruct conductivity profiles as
presented in [21] and its references. However, this work has
assumed that the permittivity profile is known a priori. A third
alternative method of reconstructing these profiles based on the
Fourier transform of surface impedance data was presented in
[3]. This method assumes that the layer travel times are all
equal, so that the scaled discrete-time Fourier transform of the
reflection response will be periodic with period in Hertz equal
to the reciprocal of the travel time. If the reflection response
is measured over one period at very high frequencies, the
variation of medium response with frequency can be neglected.
However, this method will break down completely unless the
layer thicknesses are all equal, and even then it will require
measurement at impractically high frequencies. For a travel time
of 0.5 nsec, which we use in the examples of this paper, a 1%
variation of loss would require measurements over the range
of 200-202 GHz; such frequencies would not penetrate far
anyway. Although we assume the medium is probed with an
impulsive plane wave, in fact any fast pulse of duration less
than the travel time between interfaces may be used.

Methods presented in this paper are recursive algorithms
for reconstructing both permittivity and attenuation profiles in
layered media that find their origins in the differential-inverse
class. This paper differs from previous work in four important
respects: 1) we present algorithms for reconstructing a lossy
dielectric, consisting of a stack of discrete, homogeneous,
and absorbing layers, from only the reflection responses to
impulsive plane waves at two different angles of incidence; 2)
the algorithms can be implemented in terms of digital signal
processing (DSP) techniques and algorithms, so that no discretization
of continuous equations is required; 3) a novel, recursive
use of layer stripping ideas, in which lossy layers of the
stratified dielectric are recursively reconstructed by iterating
between the forward and inverse problems, both of which
can be solved rapidly by running the asymmetric Levinson
algorithm forward and backward; and 4) the first algorithm
that can reconstruct the bottom, lossy half-space entirely from
the reflection response data.

It is important to recognize that reconstruction of attenuation
profiles differs significantly from reconstruction of layer
parameters such as permittivity. The latter can be discerned
from the interface reflection coefficient, but the former does
not appear in the reflection coefficient. Attenuation of a
layer changes in the same way that the phase velocity does as
the angle of incidence is varied. Hence probing at more than
two angles of incidence does not help. However, it does help
if the data are noisy.

We also present numerical examples that: 1) demonstrate
the significance of multiple reflections and absorption on
reconstruction; 2) demonstrate that the new, proposed algorithms
work properly; 3) demonstrate the significance of the recursive
algorithm in reconstructing a lossy layered dielectric,
especially an unknown lossy lower half-space; and 4) demonstrate
the effectiveness of the routines in the presence of noise.

Fig. 1. Glacier models: (a) single layer model, (b) multi-layer model.

This paper is organized as follows: Section II discusses
the radioglaciology inverse problem which motivates this
work (although it can also be applied to magnetotelluric
exploration, FM-CW subsurface radar, and radar reflections off
of layered composites, e.g., aircraft skins). Section III derives
the asymmetric wave system that is used to characterize
electromagnetic wave propagation in lossy, stratified dielectrics
(this is discussed in more detail in [5]). Section IV shows how
running the asymmetric Levinson algorithm forward and backward
solves the inverse and forward problems, respectively.
Section IV also presents a simpler algorithm for low-loss
media. Section V shows how the algorithms of Section IV can
be used recursively to reconstruct the bottom layer of a lossy
layered medium. Section VI presents numerical examples that
demonstrate the operation and success of the algorithms in
reconstructing realistic lossy layered dielectrics, including the
bottom half-space, from both noiseless and noisy data. Section
VII concludes with a summary.


II.  Summary  of  Radioglaciology 


Radar remote sensing is a one-sided probing method used
to investigate both naturally-occurring and man-made phenomena,
e.g., soils, forest canopies, glaciers, walls, etc. At large
wavelengths the boundaries between layers are effectively
planar interfaces; thus many of these media can be modeled
as discrete multilayer lossy media. For example, a vegetation
canopy can be modeled as two layers of vegetation over a
ground surface. The top layer of vegetation contains branches,
leaves, needles, etc., and the bottom contains tree trunks [6].

Another example is glaciers, which are of interest in numerous
disciplines, e.g., meteorology, climatology, geophysics,
etc. Glacial structure lends itself well to the problem of inverse
scattering for discrete lossy systems, for glaciers may be modeled
as: 1) a single homogeneous layer of infinite horizontal
extent; or as 2) a multilayer model of homogeneous layers
having parallel interfaces; both models being terminated by a
half-space, e.g., rock or water. When the glacier is modeled
as a single homogeneous slab [Fig. 1(a)], the reflection due
to the glacier/bedrock interface has strength determined by
the power reflection coefficient (PRC) and the mean dielectric
absorption (B) of the glacier: the total radar attenuation in
dB in a glacier of thickness (z) is 2Bz + PRC [7].

However, glaciers tend to have intervening layers of
different density, structure, and impedance characteristics,
which also attenuate the probing signal and produce multiple
reflections. These intervening layers may be firn (packed snow),
sea ice, ice lenses (higher-density frozen rainwater), or
moraine. Ignoring the intervening layers and their multiple
reflections leads to poor estimation of the glacier's actual
properties, e.g., thickness. Thus the multi-layer glacier model
[Fig. 1(b)] characterizes the physical system more effectively.
We will show that the multiple reflections provide additional
information and can be used to reconstruct the thickness, wave
speed, and loss of all the layers of the medium.

Historically,  glacial  stratification  has  been  measured  using 
seismic  sounding.  In  the  late  1960’s  an  alternative  to  seismic 
sounding  was  developed  using  pulse  (or  impulse)  radar.  This 
method  is  known  today  as  radio  echo  sounding  (RES).  The 
absorption  of  radio  waves  by  ice  is  sufficiently  low  to  make 
RES  a  feasible  probing  technique.  In  fact,  the  most  widespread 
application  of  RES  is  the  measurement  of  glacier  thickness  [8]. 
RES  has  been  found  to  produce  data  which  complements  that 
of  seismic  exploration.  For  example,  seismic  wave  velocities 
increase  with  density,  while  electromagnetic  wave  speed  is 
independent  of  density.  Also,  the  variation  in  electromagnetic 
loss  with  temperature  is  greater  than  the  variation  of  acoustic 
loss.  Thus  RES  is  becoming  an  increasingly  popular  method 
of  glacier  investigation. 

RES systems operate at lower frequencies (60-600 MHz)
than conventional radar. This enables greater penetration than
is possible at higher frequencies [9]. In addition, given the
longer wavelengths, interface roughness and slope are less
significant, so the layered model is more accurate. Early RES
of glaciers was performed at a center frequency of 620 MHz
[10]. More recent work has used frequencies from 60 MHz
to 440 MHz [11] and even as low as 1 MHz for increased
penetration [12].

Pulse radar operates on the simple principle that pulses are
not only partially absorbed within each ice layer, but also
partially reflected when an interface of dissimilar layers is
encountered. The time it takes a pulse to travel to and from an
interface is $T_n = 2 d_n / v_n$, where $d_n$ is the depth of the interface
and $v_n$ is the medium’s velocity of electromagnetic wave
propagation (wave velocity). Due to the complicated layer
structure of glaciers, a priori knowledge of the medium’s wave
velocity {$v_n$} is limited to the first layer. Yet the reconstruction
depends on knowing $v_n$ for each layer. Several techniques are
used in practice to measure $v_n$ in glaciers, e.g., interferometry,
sounding next to drill holes of known depth, and
oblique reflection sounding. Oblique reflection sounding is a
standard technique of exploration geophysics, commonly used
for seismic measurements, in which the variation of the travel-time
for a bottom reflection is measured as the transmitter and
receiver are separated along the surface.
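
The dependence of depth estimates on the wave velocity can be illustrated with a minimal sketch. It assumes the two-way relation T = 2d/v within a homogeneous, low-loss layer and v = c / sqrt(ε'); the numerical values are illustrative only and the function names are hypothetical.

```python
C = 3.0e8  # speed of light in free space (m/s)


def wave_velocity(eps_real):
    """Wave speed in a low-loss, nonmagnetic layer: v = c / sqrt(eps')."""
    return C / eps_real ** 0.5


def layer_thickness(two_way_time, velocity):
    """Thickness from the two-way travel time, assuming T = 2 d / v."""
    return 0.5 * velocity * two_way_time


def interface_depths(two_way_times, eps_reals):
    """Cumulative interface depths for a stack of homogeneous layers."""
    depths = []
    total = 0.0
    for t, eps in zip(two_way_times, eps_reals):
        total += layer_thickness(t, wave_velocity(eps))
        depths.append(total)
    return depths


# Illustrative values: a layer with eps' = 4 and a 1.6 nsec two-way time
thickness = layer_thickness(1.6e-9, wave_velocity(4.0))
depths = interface_depths([0.4e-9, 1.6e-9], [1.0, 4.0])
```

If the velocity assumed for a layer is wrong, the error propagates directly into the thickness of that layer and into the depth of every interface below it, which is why the velocity of each layer must be known or reconstructed.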

We show that, by applying a discrete lossy model, one-sided
probing at two angles can be used to reconstruct the
parameters (wave velocity, thickness and loss) of each ice shelf
layer without having a priori knowledge of the wave speed for
each layer. Previous inverse scattering techniques have also
required a priori knowledge of the bottom half-space, or a
perfectly reflecting surface at the final interface, to perform
reconstruction [2]. We show that complete reconstruction
can be performed without these constraints by presenting a
numerical example, in which oblique probing of an ice shelf
[Fig. 1(b)] is simulated.

Fig. 2. TE wave geometry.

III. Asymmetric Wave System for Lossy Media

We  quickly  review  the  derivation  of  the  asymmetric  wave 
system  for  electromagnetic  wave  propagation  in  lossy,  strati¬ 
fied  dielectrics  [5].  Equation  (8)  that  follows  will  be  vital  in 
subsequent  sections. 

A.  Basic  Equations 

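The time-harmonic form of a propagating electromagnetic
wave consists of orthogonal electric {$\mathcal{E} = \mathbf{E} e^{j\omega t}$} and
magnetic {$\mathcal{H} = \mathbf{H} e^{j\omega t}$} field vectors having amplitudes
{$\mathbf{E} = \hat{\mathbf{E}} e^{-j\mathbf{k}\cdot\mathbf{r}}$} and {$\mathbf{H} = \hat{\mathbf{H}} e^{-j\mathbf{k}\cdot\mathbf{r}}$}, respectively, and frequency
{$\omega$}. The propagating waves {$\mathcal{E}$, $\mathcal{H}$} each have a temporal
{$\omega t$} and spatial {$\mathbf{k}\cdot\mathbf{r}$} component, where $\mathbf{k} = k_x\hat{x} + k_y\hat{y} + k_z\hat{z}$
is the propagation vector and $\mathbf{r} = x\hat{x} + y\hat{y} + z\hat{z}$ is the
displacement vector. The magnitude of the propagation vector
is the wavenumber or propagation constant $k = |\mathbf{k}| = \omega\sqrt{\mu\varepsilon}$;
we note that $k = \sqrt{k_x^2 + k_y^2 + k_z^2}$.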

In general, the dielectric constant {$\varepsilon^r = \varepsilon/\varepsilon_0$} is complex
and may be expressed as $\varepsilon^r = \varepsilon' - j\varepsilon''$, where $\varepsilon'$ = relative
permittivity and $\varepsilon''$ = loss factor ($\varepsilon_0$ = permittivity of free
space). Alternatively, we may write $\varepsilon^r = \varepsilon' - j\frac{\sigma}{\omega\varepsilon_0}$, where
$\sigma$ = conductivity. A lossless and a lossy medium are specified
by real and complex $\varepsilon^r$, respectively. We make the common
assumptions that the dielectric is simple (i.e., stationary, linear
and isotropic) and nonmagnetic (i.e., $\mu = \mu_0$ = permeability
of free space).

An infinite-extent planar TE wave is considered. The plane
wave is incident on the layered dielectric at an angle $\theta_0$ relative
to the direction $z$, which is normal to all dielectric boundaries
(Fig. 2). The layered dielectric is defined as a stack of parallel
dielectric slabs having infinite lateral extent. From the TE
condition, the electric field lies completely perpendicular to the
plane of incidence; this is also known as being perpendicularly
or horizontally polarized.

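The TE electric field is given by $\mathbf{E} = \hat{y} E_y e^{-j\mathbf{k}\cdot\mathbf{r}} = \hat{y} E_y e^{-j(k_x x + k_z z)}$
and thus the homogeneous wave equation
becomes

$$\left( \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial z^2} + k^2 \right) E_y = 0. \qquad (1)$$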
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Fig.  3.  Wave  definitions  in  layered  dielectrics. 
Solving (1) in the nth dielectric layer results in a solution for
$E_{yn}$ of two energy-normalized wave components, one in the
$+z$ direction {$D_n$} and the other in the $-z$ direction {$U_n$}
(see Fig. 3).

The waves $D_n$ and $U_n$ in the nth layer are defined from the
tangential fields $E_{yn}$ and $H_{xn}$ [19] as

$$D_n = (\sqrt{\varepsilon_n}\cos\theta_n)^{1/2} E_{yn} - (\sqrt{\varepsilon_n}\cos\theta_n)^{-1/2} \eta_0 H_{xn} \qquad (2a)$$

$$U_n = (\sqrt{\varepsilon_n}\cos\theta_n)^{1/2} E_{yn} + (\sqrt{\varepsilon_n}\cos\theta_n)^{-1/2} \eta_0 H_{xn} \qquad (2b)$$

where $\eta_0 = \sqrt{\mu_0/\varepsilon_0} = 377\,\Omega$ is the intrinsic impedance of free
space and $k_n = \omega\sqrt{\mu_0\varepsilon_n}$, $k_{xn} = k_n\sin\theta_n = k_0\sin\theta_0$ (using
Snell’s law) and $k_{zn} = k_n\cos\theta_n$ or equivalently, $k_{zn} =
(\omega^2\mu_0\varepsilon_n - k_0^2\sin^2\theta_0)^{1/2}$ (using the consistency condition).
Note that the x-dependence is entirely contained in $e^{-jk_0 x\sin\theta_0}$,
which is why plane wave probing at nonnormal incidence can
be reformulated as plane wave probing at normal incidence.
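
The way Snell's law fixes the propagation angle and the z-component of the propagation constant in every layer can be sketched as follows. This is a minimal illustration for lossless (real) relative permittivities, with layer 0 taken as the medium of incidence; the function names and sample values are assumptions made for the example, and total internal reflection is not handled.

```python
import math

C = 3.0e8  # speed of light in free space (m/s)


def layer_angles(eps_real, theta0):
    """Propagation angle in each layer from Snell's law:
    sqrt(eps_n) sin(theta_n) = sqrt(eps_0) sin(theta_0)."""
    s0 = math.sqrt(eps_real[0]) * math.sin(theta0)
    return [math.asin(s0 / math.sqrt(e)) for e in eps_real]


def kz_per_layer(eps_real, theta0, omega):
    """z-component of the propagation constant from the consistency
    condition k_zn^2 = k_n^2 - k_x^2, with k_x common to all layers."""
    k0 = omega / C
    kx = k0 * math.sqrt(eps_real[0]) * math.sin(theta0)
    return [math.sqrt(k0 * k0 * e - kx * kx) for e in eps_real]


eps_real = [1.0, 4.0, 9.0]          # illustrative relative permittivities
theta0 = math.radians(30.0)         # illustrative angle of incidence
omega = 2.0 * math.pi * 100.0e6     # illustrative angular frequency

angles = layer_angles(eps_real, theta0)
kz = kz_per_layer(eps_real, theta0, omega)
```

Both routes give the same k_zn, since k_n cos θ_n and the square root of k_n² − k_x² are two expressions of the same quantity.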


B.  Interface  Effects 


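We now find the relationship between the waves {$D_n$, $U_n$}
in one dielectric layer and those {$D_{n+1}$, $U_{n+1}$} in the next.
By the continuity of tangential electric and magnetic fields at
the interface ($z = d_{n+1}$), the field components {$D_{n+1}$, $U_{n+1}$}
just below the boundary are related to components {$D'_n$, $U'_n$}
just above the boundary (see Fig. 3) by

$$\begin{bmatrix} D_{n+1} \\ U_{n+1} \end{bmatrix} = \frac{1}{\sqrt{1 - R_{n,n+1}^2}} \begin{bmatrix} 1 & -R_{n,n+1} \\ -R_{n,n+1} & 1 \end{bmatrix} \begin{bmatrix} D'_n \\ U'_n \end{bmatrix} \qquad (3a)$$

$$R_{n,n+1} = \frac{\sqrt{\varepsilon_n}\cos\theta_n - \sqrt{\varepsilon_{n+1}}\cos\theta_{n+1}}{\sqrt{\varepsilon_n}\cos\theta_n + \sqrt{\varepsilon_{n+1}}\cos\theta_{n+1}} \qquad (3b)$$

$$R^{TM}_{n,n+1} = \frac{\sqrt{\varepsilon_{n+1}}\cos\theta_n - \sqrt{\varepsilon_n}\cos\theta_{n+1}}{\sqrt{\varepsilon_{n+1}}\cos\theta_n + \sqrt{\varepsilon_n}\cos\theta_{n+1}} \qquad (3c)$$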

At any interface of dissimilar dielectrics, both components
of a propagating electromagnetic wave encounter impedance
mismatches. At the interface a portion of the incident wave
will be reflected, while the remainder will be transmitted. The
portion reflected is determined by the TE Fresnel reflection
coefficient {$R_{n,n+1}$} (3b); for a vertically polarized, parallel,
or TM wave, the Fresnel reflection coefficient is given by (3c).
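
The two Fresnel coefficients can be compared numerically with the sketch below. It uses the standard textbook forms for real relative permittivities, with the transmitted angle obtained from Snell's law; the sign convention of the TM coefficient varies between references, so the form used here is an assumption, as are the function names.

```python
import math


def transmitted_angle(eps_n, eps_next, theta_n):
    """Snell's law: sqrt(eps_n) sin(theta_n) = sqrt(eps_next) sin(theta_next)."""
    return math.asin(math.sqrt(eps_n / eps_next) * math.sin(theta_n))


def fresnel_te(eps_n, eps_next, theta_n):
    """TE (perpendicular polarization) reflection coefficient."""
    theta_next = transmitted_angle(eps_n, eps_next, theta_n)
    a = math.sqrt(eps_n) * math.cos(theta_n)
    b = math.sqrt(eps_next) * math.cos(theta_next)
    return (a - b) / (a + b)


def fresnel_tm(eps_n, eps_next, theta_n):
    """TM (parallel polarization) reflection coefficient (one common
    sign convention, assumed here)."""
    theta_next = transmitted_angle(eps_n, eps_next, theta_n)
    a = math.sqrt(eps_next) * math.cos(theta_n)
    b = math.sqrt(eps_n) * math.cos(theta_next)
    return (a - b) / (a + b)


te_normal = fresnel_te(1.0, 4.0, 0.0)
tm_normal = fresnel_tm(1.0, 4.0, 0.0)
# At the Brewster angle, tan(theta) = sqrt(eps_next / eps_n), the TM
# coefficient vanishes while the TE coefficient does not.
brewster = math.atan(math.sqrt(4.0 / 1.0))
tm_brewster = fresnel_tm(1.0, 4.0, brewster)
te_brewster = fresnel_te(1.0, 4.0, brewster)
```

At normal incidence the two coefficients have equal magnitude; they separate as the angle of incidence increases.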


C.  Homogeneous  Layer  Effects 

Within a homogeneous dielectric layer, the reflection coefficient
is zero, so that the relation between the waves {$D'_n$, $U'_n$}
at the bottom of the nth layer and those {$D_n$, $U_n$} at the top
of the nth layer is given by the propagation effect through the
layer, where $\Delta_n = d_{n+1} - d_n$ is the thickness of the nth layer.

For a lossy layer, the propagation constant {$k_n$} is complex.
Specifically, the propagation constant in the $\pm z$ direction is
given by {$k_{zn} = \beta_{zn} - j\alpha_{zn}$} and is the complex sum of
attenuation {$\alpha_{zn}$} and phase {$\beta_{zn}$} constants. We thus have

$$\begin{bmatrix} D'_n \\ U'_n \end{bmatrix} = \begin{bmatrix} e^{-(\alpha_{zn} + j\beta_{zn})\Delta_n} & 0 \\ 0 & e^{+(\alpha_{zn} + j\beta_{zn})\Delta_n} \end{bmatrix} \begin{bmatrix} D_n \\ U_n \end{bmatrix} \qquad (4)$$

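D. Combined Effects

Define $\Lambda_{n+1} = \sum_{i=0}^{n} \alpha_{zi}\Delta_i$ and the
scaled waves $\tilde D_n = D_n e^{+\Lambda_n}$ and $\tilde U_n = U_n e^{-\Lambda_n}$. Then, using
the commutativity of multiplication of diagonal matrices, (3a)
and (4) can be combined into

$$\begin{bmatrix} \tilde D_{n+1} \\ \tilde U_{n+1} \end{bmatrix} = \frac{1}{\sqrt{1 - R_{n,n+1}^2}} \begin{bmatrix} 1 & -R_{n,n+1} e^{+2\Lambda_{n+1}} \\ -R_{n,n+1} e^{-2\Lambda_{n+1}} & 1 \end{bmatrix} \begin{bmatrix} e^{-j\beta_{zn}\Delta_n} & 0 \\ 0 & e^{+j\beta_{zn}\Delta_n} \end{bmatrix} \begin{bmatrix} \tilde D_n \\ \tilde U_n \end{bmatrix} \qquad (5)$$

The first matrix is the transfer matrix, the second matrix is the
time-delay matrix, and their product is the layer matrix.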

E. Discrete-Time Formulation

If we make the assumptions that the dielectrics are materials
low in loss {$|\omega\varepsilon_n| \gg \sigma_n$} and relatively independent of
frequency {$\varepsilon_n(\omega) \approx \varepsilon_n$}, the following approximation can
be made

JUJr/Hoeoy^COS  Bn 

-  jPzn  +  Qzn  !=  (—  +  COS^n 

VO^n  f 


and  a. 


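where $v_n = (\mu_0\varepsilon_0\varepsilon_n)^{-1/2}$ is the wave speed in the nth layer
and $v_{zn} = v_n/\cos\theta_n$ is the phase velocity for the nth layer
in the $\pm z$ direction. We introduce the following frequency-independent
transfer matrix coefficients {$r_{n,n+1}$, $s_{n,n+1}$}:

$$r_{n,n+1} = R_{n,n+1} e^{-2\Lambda_{n+1}} \qquad (7a)$$

$$s_{n,n+1} = R_{n,n+1} e^{+2\Lambda_{n+1}} \qquad (7b)$$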

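Note that the product of phase constant {$\beta_{zn}$} and traveled
distance {$\Delta_n$} may be written in terms of the phase velocity
{$v_{zn}$}, i.e., $\beta_{zn}\Delta_n = \omega\Delta_n / v_{zn} = \omega T_{zn}$ [13], where $T_{zn}$ is
the travel-time in the {$\pm z$} direction for the nth layer. We
also assume that layer travel distances {$\Delta_n$} are such that
$\Delta_n / v_{zn} = T_{zn} = n_n T$, i.e., the travel-time {$T_{zn}$} is an integer
{$n_n$} multiple of a very small time increment {$T$} (in the
example of Section VI we use 0.5 nsec). Replacing the Fourier
kernel in (5) with the z-transform kernel, i.e., $e^{-j\omega T} \Rightarrow z^{-1}$,
and scaling $T = 1$ without loss of generality, we obtain the
discrete-time expression of the layer matrix:

$$\begin{bmatrix} \tilde D_{n+1}(z) \\ \tilde U_{n+1}(z) \end{bmatrix} = \frac{1}{\sqrt{1 - r_{n,n+1} s_{n,n+1}}} \begin{bmatrix} 1 & -s_{n,n+1} \\ -r_{n,n+1} & 1 \end{bmatrix} \begin{bmatrix} z^{-n_n} & 0 \\ 0 & z^{+n_n} \end{bmatrix} \begin{bmatrix} \tilde D_n(z) \\ \tilde U_n(z) \end{bmatrix} \qquad (8)$$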

The significance of (8) is twofold. First, the unequal transfer
coefficients {$r_{n,n+1} \neq s_{n,n+1}$} model losses in the medium.

Second,  the  explicitly  discrete-time  formulation  we  have  pro¬ 
vided  (emphasized  by  our  use  of  the  z-transform)  means 
that  digital  signal  processing  (DSP)  techniques  can  be  used 
to  obtain  exact  answers.  For  example,  we  are  not  using  the 
discrete  Fourier  transform  (DFT)  as  an  approximation  to  the 
continuous  Fourier  transform,  as  is  often  the  case.  The  DFT 
will  give  the  exactly  correct  answer  for  our  problem,  since  the 
measured  data  and  the  assumed  model  of  the  medium  are  both 
discrete.  Our  explicitly  discrete  formulation  solves  the  problem 
of  reconstructing  a  discrete  layered  medium  exactly  using 
DSP  techniques  for  their  own  sakes,  not  as  approximations  to 
continuous  equations.  Note  that  (8)  is  analogous  to  equations 
describing  asymmetric  lattice  filters  in  DSP  theory. 


F.  Inverse  Scattering 

Let $r(n)$ be the discrete-time impulse reflection response
of the dielectric medium, which consists of a stack of layers
in each of which wave propagation is described using (8). Let
$\tilde r(n)$ be the impulse reflection response of the adjoint medium,
a fictitious medium which consists of a stack of fictitious layers
in each of which wave propagation is described using (8) with
$r_{n,n+1}$ and $s_{n,n+1}$ exchanged (of course, $\tilde r(n)$ is not directly
measurable, but we will never need to measure $\tilde r(n)$; see
Section IV). In [5], it has been shown that $r_{n,n+1}$ and $s_{n,n+1}$
are the reflection coefficients computed when the following
asymmetric Toeplitz system is solved using the asymmetric
Levinson algorithm:


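$$\begin{bmatrix} 1 & r(1) & r(2) & \cdots & r(n) \\ \tilde r(1) & 1 & r(1) & \cdots & r(n-1) \\ \tilde r(2) & \tilde r(1) & 1 & \cdots & \vdots \\ \vdots & & & \ddots & r(1) \\ \tilde r(n) & \cdots & \tilde r(2) & \tilde r(1) & 1 \end{bmatrix} \begin{bmatrix} a_n(0) & b_n(n) \\ a_n(1) & \vdots \\ \vdots & b_n(1) \\ a_n(n) & b_n(0) \end{bmatrix} = \begin{bmatrix} \tau_n & 0 \\ 0 & \vdots \\ \vdots & 0 \\ 0 & \tau_n \end{bmatrix} \qquad (9)$$

where the asymmetric Levinson algorithm is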
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TABLE  I 

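Layered Dielectric Reconstruction from Transfer Coefficients

Lossless Dielectric

Given: $R_{n,n+1}$, $T_n$
Find: $\varepsilon'_n$, $\Delta_n$
Step 1. Initialize with $\varepsilon'_0 = 1$, $n = 0$.
Step 2. $\Delta_n = \dfrac{T_n c}{2\sqrt{\varepsilon'_n}}$, $c$ = speed of light
Step 3. $\sqrt{\varepsilon'_{n+1}} = \sqrt{\varepsilon'_n}\,\dfrac{1 - R_{n,n+1}}{1 + R_{n,n+1}}$
Step 4. Increment n, return to Step 2.

Lossy Dielectric

Given: $r_{n,n+1}$, $s_{n,n+1}$, $T_n$
Find: $\varepsilon'_n$, $\Delta_n$, $\alpha_n$
1. Initialize with $\varepsilon'_0 = 1$, $L_{-1} = 1$, $n = 0$.
2. $\Delta_n = \dfrac{T_n c}{2\sqrt{\varepsilon'_n}}$
3. $R_{n,n+1} = \operatorname{sgn}(r_{n,n+1})\sqrt{r_{n,n+1} s_{n,n+1}}$, and $\sqrt{\varepsilon'_{n+1}} = \sqrt{\varepsilon'_n}\,\dfrac{1 - R_{n,n+1}}{1 + R_{n,n+1}}$
4. $L_n = \sqrt{r_{n,n+1} / s_{n,n+1}}$ (since the loss factor is always positive)
5. $\dfrac{L_n}{L_{n-1}} = e^{-2\alpha_n\Delta_n} \Rightarrow \alpha_n = \dfrac{1}{2\Delta_n}\ln\dfrac{L_{n-1}}{L_n}$
6. Increment n, return to Step 2.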
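
The lossy-dielectric procedure can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not the authors' code: it assumes normal incidence, a free-space top layer, r = R·L and s = R/L with L the cumulative two-way loss, and two-way layer times T_n. The input values are rounded transfer coefficients and travel times for a three-layer lossy model with permittivities 4, 1, 9 between free-space half-spaces; the function name is hypothetical.

```python
import math

C = 3.0e8  # speed of light in free space (m/s)


def reconstruct_lossy(r, s, two_way_times):
    """Recover (eps', thickness, alpha) for each layer, top to bottom.

    r, s: transfer coefficients at interfaces (n, n+1), n = 0, 1, ...
    two_way_times: two-way travel time of layer n (seconds)
    Returns the list of layer tuples and eps' of the bottom half-space.
    """
    sqrt_eps = 1.0      # layer 0 is assumed to be free space
    loss_prev = 1.0     # cumulative loss above layer 0
    layers = []
    for r_n, s_n, t_n in zip(r, s, two_way_times):
        thickness = C * t_n / (2.0 * sqrt_eps)
        loss_n = math.sqrt(r_n / s_n)            # positive root
        alpha = math.log(loss_prev / loss_n) / (2.0 * thickness)
        layers.append((sqrt_eps ** 2, thickness, alpha))
        # R has the same sign as r and s because the loss is positive
        R = math.copysign(math.sqrt(r_n * s_n), r_n)
        sqrt_eps *= (1.0 - R) / (1.0 + R)
        loss_prev = loss_n
    return layers, sqrt_eps ** 2


r = [-0.333, 0.202, -0.235, 0.161]
s = [-0.333, 0.551, -1.063, 1.549]
times = [0.40e-9, 1.60e-9, 0.80e-9, 1.20e-9]

layers, eps_bottom = reconstruct_lossy(r, s, times)
for eps, thickness, alpha in layers:
    print(f"eps' = {eps:.2f}, thickness = {thickness:.3f} m, alpha = {alpha:.2f}")
```

Because the inputs are rounded to three digits, the recovered parameters match the model only to within a fraction of a percent; with exact coefficients the recursion is exact.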

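Asymmetric Levinson Algorithm

Step 1: Initialize: $n = 0$, $A_0(z) = B_0(z) = 1$, and $\tau_0 = 1$.
Step 2:

$$r_{n,n+1} = \frac{1}{\tau_n}\sum_{i=0}^{n} a_{n,i}\, r(n+1-i), \qquad s_{n,n+1} = \frac{1}{\tau_n}\sum_{i=0}^{n} b_{n,i}\, \tilde r(i+1)$$

Step 3:

$$\begin{bmatrix} A_{n+1}(z) \\ B_{n+1}(z) \end{bmatrix} = \frac{1}{\sqrt{1 - r_{n,n+1} s_{n,n+1}}} \begin{bmatrix} 1 & -s_{n,n+1} \\ -r_{n,n+1} & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & z^{+1} \end{bmatrix} \begin{bmatrix} A_n(z) \\ B_n(z) \end{bmatrix}$$

Step 4: $\tau_{n+1} = \tau_n \sqrt{1 - r_{n,n+1} s_{n,n+1}}$
Step 5: Increment n, return to Step 2.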

We  will  not  discuss  this  further  here,  although  the  similarity 
of  the  wave  system  (8)  and  the  recursion  (Step  3)  should 
convince  the  reader  there  is  some  connection.  The  asymmetric 
Schur  algorithm  may  be  used  in  place  of  the  asymmetric 
Levinson  algorithm;  this  is  advantageous  in  a  parallel  com¬ 
puting  environment. 
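
For readers unfamiliar with Levinson-type recursions on nonsymmetric Toeplitz matrices, the following sketch shows the generic, unnormalized recursion in pure Python. It is a textbook-style illustration and not the normalized algorithm of this paper: the scaling differs from Steps 3 and 4, and the two quantities `eps_f` and `eps_b` computed at each order only play a role analogous to the pair of transfer coefficients. All names are hypothetical.

```python
def nonsymmetric_levinson(row, col):
    """Order-recursive solution of T f = e_first and T b = e_last.

    T is Toeplitz with first row `row` and first column `col`
    (row[0] == col[0] == 1 is assumed).  Returns the forward vector f,
    the backward vector b, and the per-order pairs (eps_f, eps_b).
    """
    order = len(row) - 1
    f = [1.0]
    b = [1.0]
    pairs = []
    for n in range(order):
        # Residuals produced by zero-padding the current solutions
        eps_f = sum(col[n + 1 - j] * f[j] for j in range(n + 1))
        eps_b = sum(row[j + 1] * b[j] for j in range(n + 1))
        denom = 1.0 - eps_f * eps_b
        f_ext = f + [0.0]
        b_ext = [0.0] + b
        # Combine the padded vectors so that the residuals cancel
        f = [(fe - eps_f * be) / denom for fe, be in zip(f_ext, b_ext)]
        b = [(be - eps_b * fe) / denom for fe, be in zip(f_ext, b_ext)]
        pairs.append((eps_f, eps_b))
    return f, b, pairs


def toeplitz_matvec(row, col, x):
    """Multiply the Toeplitz matrix defined by (row, col) with x."""
    size = len(x)
    return [
        sum((col[i - j] if i >= j else row[j - i]) * x[j] for j in range(size))
        for i in range(size)
    ]


row = [1.0, -0.3, 0.1]   # illustrative first row
col = [1.0, 0.5, 0.2]    # illustrative first column
f, b, pairs = nonsymmetric_levinson(row, col)
Tf = toeplitz_matvec(row, col, f)
Tb = toeplitz_matvec(row, col, b)
```

Each order costs O(n) operations, so the whole recursion costs O(n²) rather than the O(n³) of general elimination; when the first row and first column coincide, the two residuals are equal and the recursion reduces to the symmetric Levinson algorithm.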


G. Reconstruction of Medium from $r_{n,n+1}$ and $s_{n,n+1}$

A very important issue that arises when the reflection data
are noisy is distinguishing actual interfaces from noise spikes
in the reflection data that could be interpreted incorrectly as
interfaces. Several authors, e.g., [15], have proposed thresholding
the reflection data, so that a data value above the threshold
is interpreted as an actual reflection, while a data value below
threshold is regarded as noise and set to zero. This approach
has proven quite promising for lossless media, and we propose
to use it for lossy media as well (see Section VI).

The transfer coefficient $r_{n,n+1}$ ($s_{n,n+1}$) for the real medium
is the product (quotient) of the reflection coefficient $R_{n,n+1}$
at the boundary between the nth and (n + 1)th layers and the
cumulative loss factor {$L_n = e^{-2\Lambda_{n+1}}$} of the system up to
this boundary. For the adjoint medium, $r_{n,n+1}$ ($s_{n,n+1}$) is the
quotient (product) of $R_{n,n+1}$ and $L_n$. For normal TE wave
propagation {$\theta_i = 0$} and {$\alpha_{zi} = \alpha_i$}, for all $i$.

Lossless medium: $r_{n,n+1} = R_{n,n+1} L_n = R_{n,n+1} = s_{n,n+1}$ (10a)

Lossy medium: $r_{n,n+1} = R_{n,n+1} L_n$, $s_{n,n+1} = R_{n,n+1} / L_n \Rightarrow r_{n,n+1} s_{n,n+1} = R_{n,n+1}^2$ (10b)

where

$$R_{n,n+1} = \frac{\sqrt{\varepsilon_n} - \sqrt{\varepsilon_{n+1}}}{\sqrt{\varepsilon_n} + \sqrt{\varepsilon_{n+1}}} \qquad (11a)$$

$$L_n = \exp\left( -2\sum_{i=0}^{n} \alpha_i \Delta_i \right) \qquad (11b)$$

The time interval {$T_n$} is the two-way propagation time in the
nth layer and it is given by the interval between the $r_{n-1,n}$ and
$r_{n,n+1}$ transfer coefficients. From the transfer coefficients and
time intervals, the medium can be completely reconstructed
as summarized in Table I.


IV. Reconstruction Using Two Angles of Incidence

In [5], it has been shown that the impulse reflection and
transmission responses of a lossy layered medium, from both
sides of the medium, can be used to determine $r(n)$ and $\tilde r(n)$.
However, in many applications (e.g., RES of a glacier) it is not
practical to probe a medium from both sides. Thus, it behooves
us to find a method for reconstructing a lossy medium purely
from the reflection response on one side. Previous work [14]
has shown that the one-sided reflection response is sufficient
to reconstruct a lossless medium. In this section, we show that
the one-sided reflection responses at two angles of incidence
provide sufficient information to reconstruct a lossy medium.
In practice, an impulsive point source would be used; such
a source can be decomposed into impulsive plane waves at
various angles of incidence using the Weyl integral, and plane
wave reflection responses at different angles can be separated
by beam forming. The use of an impulsive plane wave at a
specific angle {$\theta_0$} allows us to reformulate 2-D problems as
1-D problems and use the formulation described in Section III.

affect the interface between layers. We were able to recover $T_i$
in a layer by looking at the interval between $\hat r_{i-1,i}$ and $\hat r_{i,i+1}$
(note that the interval is equivalent to $T_i$ since the multiple
reflections have been suppressed). However, determining $L_i$
requires use of the magnitude of $\hat r_{i,i+1}$, which cannot yet be
interpreted.


A. Reconstruction of Permittivities from Time Intervals

We begin by assembling the normal incidence reflection
response {$r(n)$} into a symmetric Toeplitz system, i.e., let
$\tilde r(n) = r(n)$ in (9). Running the symmetric Levinson algorithm
on this system results in a single set of nonzero transfer
coefficients {$\hat r_{i,i+1}$} which occur at time intervals {$T_i$} (at other
times the transfer coefficients are zero). Note that the Levinson
recursions are necessary to suppress multiple reflections due
to previous layers: this is called predictive deconvolution.
Since the medium losses are not properly accounted for, these
coefficients {$\hat r_{i,i+1}$} do not have the same values as the true
transfer coefficients {$r_{i,i+1}$, $s_{i,i+1}$}. However, $\hat r_{i,i+1}$, $r_{i,i+1}$
and $s_{i,i+1}$ will all occur at the same times, permitting recovery
of the intervals {$T_i$}. If the medium has low losses, the
Levinson recursions will reduce (but not eliminate, since
$\hat r_{i,i+1} \neq r_{i,i+1}$) the multiple reflections enough to enable
easy determination of the interfaces directly off the primary
reflections.

Next, the medium is probed with an impulse at some nonzero angle {$\theta_0$}, resulting in the reflection response {$r'(n)$}. Assembling this response into a symmetric Toeplitz system (9), and running the symmetric Levinson algorithm, produces a different set of transfer coefficients {$\hat{r}'_{i,i+1}$}. These transfer coefficients occur at time intervals {$T'_i$}. The time intervals for normal {$T_i$} and oblique {$T'_i$} sensing are inversely proportional to the wavespeeds {$v_i$ and $v'_i$, respectively} in each layer, and are related to the angle of travel {$\theta_i$} (6) by

$$v'_i = \frac{v_i}{\cos\theta_i} \quad\Longrightarrow\quad \theta_i = \cos^{-1}\!\left(\frac{T'_i}{T_i}\right). \qquad (12)$$
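As a minimal sketch of (12), the angle of travel in each layer follows from the ratio of the oblique to the normal interval. The function name `propagation_angles` is an illustrative assumption; the intervals are those of the air and firn layers in the numerical example of Section VI.

```python
import math

def propagation_angles(T_normal, T_oblique):
    """Angle of travel (degrees) per layer from theta_i = arccos(T'_i / T_i)."""
    angles = []
    for T, T_obl in zip(T_normal, T_oblique):
        if not 0.0 < T_obl <= T:
            raise ValueError("each oblique interval must satisfy 0 < T' <= T")
        angles.append(math.degrees(math.acos(T_obl / T)))
    return angles

# Intervals (nsec) of the air and firn layers in the numerical example.
angles = propagation_angles([115.5, 213.0], [100.0, 200.0])
print([round(a, 1) for a in angles])  # prints [30.0, 20.1]
```

The 0.5 nsec quantization of the intervals limits the accuracy of the recovered angles, which is why the firn angle comes out near 20.1 degrees rather than the model value.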


Thus from the two sets of time intervals {$T_i, T'_i$}, the angle of propagation in each layer {$\theta_i$} can be calculated. Using the low-loss assumption (6) {$\beta_i \approx k_0\sqrt{\epsilon_i}$}, Snell's law relates the angles of propagation in all layers by $\sqrt{\epsilon_i}\,\sin\theta_i = \sqrt{\epsilon_j}\,\sin\theta_j$.

Since the angle of incidence {$\theta_0$} and permittivity {$\epsilon_0$} of the medium from which probing takes place are known (e.g., for air $\epsilon_0 = 1$), the permittivity of each layer {$\epsilon_i$} can be found recursively. From the permittivity {$\epsilon_i$} and travel-time {$T_i$}, the thickness of each layer is determined by $\Delta_i = v_i T_i / 2 = c\,T_i / (2\sqrt{\epsilon_i})$, where $T_i$ is the two-way travel time.
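The recursion described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name `layers_from_angles` is hypothetical, the thickness relation is taken as $\Delta_i = c\,T_i/(2\sqrt{\epsilon_i})$ with $T_i$ the two-way delay, and the inputs are the model angles and delays listed in Tables II and III.

```python
import math

C_M_PER_NS = 0.299792458  # free-space speed of light, metres per nanosecond

def layers_from_angles(theta_deg, T_ns, eps0=1.0):
    """theta_deg[i]: angle of travel in layer i (theta_deg[0] is the incidence angle).
    T_ns[i]: two-way normal-incidence travel time of layer i, in nsec.
    Returns (permittivities, thicknesses in metres)."""
    eps = [eps0]
    for i in range(1, len(theta_deg)):
        ratio = (math.sin(math.radians(theta_deg[i - 1]))
                 / math.sin(math.radians(theta_deg[i])))
        eps.append(eps[i - 1] * ratio ** 2)  # Snell's law, applied layer to layer
    delta = [C_M_PER_NS * T / (2.0 * math.sqrt(e)) for T, e in zip(T_ns, eps)]
    return eps, delta

eps, delta = layers_from_angles([30.00, 20.23, 16.92, 15.63],
                                [115.5, 213.0, 1045.0, 207.5])
print([round(e, 2) for e in eps], [round(d, 1) for d in delta])
```

With these inputs the permittivities come out close to 1, 2.09, 2.95 and 3.44, and the thicknesses close to the values of Table II.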

Once the permittivity of each layer is known, the reflection coefficient at each boundary can be calculated for both normal and oblique incidence.

We now introduce a novel method of reconstruction which involves running the Levinson algorithm backward to generate a synthetic response {$\hat{r}(n)$}. We begin by simulating the response for a lossless version of the medium, i.e., $\hat{\epsilon}' = \epsilon'$ but $\hat{\epsilon}'' = 0$. This response is generated using the backward asymmetric Levinson algorithm with $\hat{r}_{i,i+1} = \hat{s}_{i,i+1} = R_{i,i+1}$.

The backward asymmetric Levinson algorithm is as follows:

Backward Asymmetric Levinson Algorithm

Step 1: Initialize: $n = 0$, $A_0(z) = B_0(z) = 1$, and $\tau_0 = 1$.

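Step 2: [matrix recursion giving $A_{n+1}(z)$, $B_{n+1}(z)$ from $A_n(z)$, $B_n(z)$ using the transfer coefficients $s_{n,n+1}$ and $r_{n,n+1}$; equation not legible in the source]

Step 3: [expression for $\hat{r}(n+1)$ involving $r_{n,n+1}\tau_n$ and a sum over $i = 0, \ldots, n-1$; equation not legible in the source]

Step 4: [update of $\tau_{n+1}$; equation not legible in the source]

Step 5: Increment n, return to Step 2.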

If the generated response {$\hat{r}(n)$} found with $\hat{r}_{i,i+1} = \hat{s}_{i,i+1} = R_{i,i+1}$ is equal to the measured response {$r(n)$}, then we have just verified the medium to be lossless. If on the other hand $\hat{r}(n) \neq r(n)$, then we have verified the medium to be lossy and the values for $\hat{r}_{i,i+1}$ and $\hat{s}_{i,i+1}$ must be updated. When the medium has loss, the measured returns {$r(n)$} are less in magnitude than those just generated {$\hat{r}(n)$}, i.e., $|r(n)| < |\hat{r}(n)|$. We note that the first nonzero return {$r(n_0)$} is due to reflection off the first interface, and that its magnitude is determined solely by that boundary's actual transfer coefficients {$r_{0,1}$}, whereas in our simulated medium, the first nonzero return {$\hat{r}(n_0)$} is determined by the first boundary's reflection coefficient {$R_{0,1}$}. Recall that we know $R_{0,1}$ from (3b), since we reconstructed the permittivities from the interval information. Since the transfer and reflection coefficients are related through the loss factor {$L_0$}, $L_0$ can be found using


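$$L_0 = \frac{r(n_0)}{\hat{r}(n_0)}, \qquad r_{0,1} = R_{0,1}L_0 \quad\text{and}\quad s_{0,1} = \frac{R_{0,1}}{L_0}. \qquad (13)$$

Here the reflection coefficients are

$$R_{i,i+1} = \frac{\sqrt{\epsilon_i} - \sqrt{\epsilon_{i+1}}}{\sqrt{\epsilon_i} + \sqrt{\epsilon_{i+1}}}$$

for normal incidence and

$$R'_{i,i+1} = \frac{\sqrt{\epsilon_i}\cos\theta_i - \sqrt{\epsilon_{i+1}}\cos\theta_{i+1}}{\sqrt{\epsilon_i}\cos\theta_i + \sqrt{\epsilon_{i+1}}\cos\theta_{i+1}}$$

for oblique incidence.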


B. Reconstruction of Loss Factors from Reflection Response Values

Using reflection responses at more than two angles does not provide sufficient additional information to compute $L_i$ and $\alpha_i$. The reason for this is that $L_i$, like the travel-times $T_i$, do not


Note that we have now found the exact transfer coefficients for the first boundary. Since the loss in the first layer is reflected in the subsequent transfer coefficients, we can now generate trial values for the remaining lossy medium transfer coefficients {$\hat{r}_{i,i+1}, \hat{s}_{i,i+1}$}:

$$\hat{r}_{i,i+1} = R_{i,i+1}L_0 \quad\text{and}\quad \hat{s}_{i,i+1} = \frac{R_{i,i+1}}{L_0}, \qquad \forall i \in [0, t-1]. \qquad (14)$$
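A minimal sketch of the first-interface step, reading (13) as $L_0 = r(n_0)/\hat{r}(n_0)$ and (14) as scaling the lossless reflection coefficients by $L_0$. The function name and the numerical values are hypothetical and serve only to illustrate the arithmetic.

```python
def first_layer_update(r_meas_n0, r_synth_n0, R):
    """R[i]: lossless reflection coefficient of interface (i, i+1).
    r_meas_n0, r_synth_n0: first nonzero samples of the measured and
    synthetic (lossless) responses.
    Returns L0, the exact (r01, s01), and trial lists for deeper interfaces."""
    L0 = r_meas_n0 / r_synth_n0
    exact = (R[0] * L0, R[0] / L0)
    r_trial = [Ri * L0 for Ri in R[1:]]  # trial r_hat_{i,i+1}, i >= 1
    s_trial = [Ri / L0 for Ri in R[1:]]  # trial s_hat_{i,i+1}, i >= 1
    return L0, exact, r_trial, s_trial

# Hypothetical numbers: a lossless synthetic first return of -0.20 measured as -0.19.
L0, (r01, s01), r_trial, s_trial = first_layer_update(-0.19, -0.20, [-0.20, -0.10, -0.05])
print(L0, r01, s01, r_trial, s_trial)
```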
To find the losses in the subsequent layers, we generate two responses using the backward asymmetric Levinson algorithm. The first response {$\hat{r}(n)$} is our new synthetic medium impulse response generated by using the current trial values for the transfer coefficients for all layers {$\hat{r}_{i,i+1}, \hat{s}_{i,i+1}, \forall i \in [0, t-1]$}.

The second response {$\tilde{r}(n)$} is generated by using only the coefficients {$r_{i,i+1}, s_{i,i+1}, \forall i \in [0, p-1]$} that are known exactly after the first p layers have been identified; the remaining coefficients are set to zero. We once again compare the trial {$\hat{r}(n)$} with the measured {$r(n)$} response. Any difference indicates incorrect values of the trial transfer coefficients.

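Let $\hat{r}(n_p)$ and $r(n_p)$ differ, indicating that our trial values for the transfer coefficients {$\hat{r}_{p,p+1}, \hat{s}_{p,p+1}$} are incorrect. We correct the coefficients as follows:

$$L_p = L_{p-1}\,\frac{r(n_p) - \tilde{r}(n_p)}{\hat{r}(n_p) - \tilde{r}(n_p)} \qquad (15)$$

and $r_{p,p+1} = R_{p,p+1}L_p$, $s_{p,p+1} = R_{p,p+1}/L_p$. We subtract $\tilde{r}(n)$ from the measured {$r(n)$} and trial {$\hat{r}(n)$} responses to remove the effects of multiple reflections up to and including time $n_p$ due to the layers previously determined. A new layer has now been identified, and we update the trial values for the transfer coefficients for subsequent layers, using

$$\hat{r}_{i,i+1} = R_{i,i+1}L_p \quad\text{and}\quad \hat{s}_{i,i+1} = \frac{R_{i,i+1}}{L_p}, \qquad \forall i \in [p+1, t-1]. \qquad (16)$$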
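The update at one interface can be sketched as below. This assumes (15) has the form $L_p = L_{p-1}\,[r(n_p) - \tilde{r}(n_p)]/[\hat{r}(n_p) - \tilde{r}(n_p)]$, where $\tilde{r}$ is the response generated from the exactly known layers only; that reading follows from the surrounding text, since the printed equation is only partly legible. The function name `strip_layer` and the check values are hypothetical.

```python
def strip_layer(L_prev, r_meas, r_trial, r_known, R, p):
    """One layer-stripping update at the arrival time n_p of interface (p, p+1).
    r_meas, r_trial, r_known: samples of the measured response, the trial
    response, and the response of the already-identified layers at n_p.
    R[i]: lossless reflection coefficient of interface (i, i+1)."""
    # Subtracting r_known removes multiples from layers already determined.
    L_p = L_prev * (r_meas - r_known) / (r_trial - r_known)
    r_exact, s_exact = R[p] * L_p, R[p] / L_p
    # Trial values for the deeper interfaces, as in (16).
    r_hat = [R[i] * L_p for i in range(p + 1, len(R))]
    s_hat = [R[i] / L_p for i in range(p + 1, len(R))]
    return L_p, r_exact, s_exact, r_hat, s_hat

# Hypothetical check: build samples from a known loss factor and recover it.
R = [-0.1822, -0.0859, -0.0384, -0.6642]
L_prev, L_true, gain, multiples = 0.9960, 0.8813, 0.95, 0.003
r_meas = multiples + gain * R[2] * L_true
r_trial = multiples + gain * R[2] * L_prev
L_p, r_exact, s_exact, r_hat, s_hat = strip_layer(L_prev, r_meas, r_trial, multiples, R, 2)
print(L_p)
```

The factor `gain` stands in for whatever common amplitude scaling the primary experiences in both the measured and the trial response; it cancels in the ratio.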

One  other  important  point  should  be  noted  here.  The  first 
stage  of  our  algorithm  uses  the  symmetric  Levinson  algorithm 
to  reduce  multiple  reflections  and  identify  depths  (in  travel 
time)  of  interfaces.  However,  the  symmetric  Levinson  algo¬ 
rithm  incorrectly  assumes  a  lossless  medium.  As  a  result, 
multiple  reflections  will  not  be  eliminated,  although  if  the 
medium  losses  are  low  they  will  be  greatly  suppressed,  and  if 
the  medium  losses  are  high  they  will  be  very  small  anyway. 
However, any significant multiple reflections may be misidentified as primary reflections from (nonexistent) interfaces.

If we know a priori that contiguous layers have significantly different properties, simply thresholding the reflection coefficients generated by the symmetric Levinson algorithm will distinguish the primary reflections from the multiple reflections. Otherwise we can check the consistency of the result of each recursion, again by comparing the trial {$\hat{r}(n)$} with the measured {$r(n)$} responses. Any differences for $n < n_p$ indicate misidentification of an interface. This is another reason for this recursive use of layer stripping.

C. Summary of Algorithm

The reconstruction method for the discrete lossy dielectric is as follows:

Reconstruction of Lossy Dielectric Using Two Angles:

Step 1: Measure impulse reflection responses for normal {$r(n)$} and oblique {$r'(n)$} probing.

Step 2: Structure each response into a symmetric Toeplitz system (9); solve this system using symmetric Levinson algorithm.

Step 3: Determine time intervals {$T_i, T'_i$} between coefficients {$\hat{r}, \hat{r}'$}.

Step 4: $\theta_i = \cos^{-1}(T'_i / T_i)$, $\forall i \in [1, t-1]$.

Step 5: From $\epsilon_0$ and $\theta_0$: $\sqrt{\epsilon_i} = \sqrt{\epsilon_0}\,\sin\theta_0 / \sin\theta_i$, $\forall i \in [1, t-1]$.

Step 6: $\hat{r}_{i,i+1} = \hat{s}_{i,i+1} = R_{i,i+1}$.

Step 7: Initialize: $p = 0$, $L_{-1} = 1$, $L'_{-1} = 1$.

Step 8: Using backward asymmetric Levinson algorithm generate:

• $\tilde{r}(n)$ using transfer coefficients {$r_{i,i+1}, s_{i,i+1}$} with $r_{i,i+1} = s_{i,i+1} = 0$, $\forall i \in [p, t-1]$

• $\hat{r}(n)$ using transfer coefficients {$\hat{r}_{i,i+1}, \hat{s}_{i,i+1}$}, $\forall i \in [0, t-1]$


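Step 9: $L_p = L_{p-1}\,\dfrac{r(n_p) - \tilde{r}(n_p)}{\hat{r}(n_p) - \tilde{r}(n_p)}$ and $L'_p = L'_{p-1}\,(L_p / L_{p-1})^{\cos\theta_p}$.

Step 10: $r_{p,p+1} = R_{p,p+1}L_p$ and $s_{p,p+1} = R_{p,p+1}/L_p$; $\hat{r}_{i,i+1} = R_{i,i+1}L_p$, $\forall i \in [p+1, t-1]$.

Step 11: Increment p, return to Step 8.

Step 12: With exact coefficients {$r_{i,i+1}, s_{i,i+1}, \forall i \in [0, t-1]$}, implement lossy system reconstruction in Section III-G.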

D. Special Case: Low-Loss Medium

A low-loss medium is defined as one in which the attenuation constants in all layers and the reflection coefficients for all interfaces are both very small, i.e., $\alpha_i^2 \ll 1$ and $R_{i,i+1}^2 \ll 1$. Under these conditions, the following approximation is valid:

$$\hat{r}_{i,i+1} \approx r_{i,i+1} \quad\text{and}\quad \hat{r}'_{i,i+1} \approx r'_{i,i+1}, \qquad \forall i \in [0, t-1]. \qquad (17)$$

With this approximation the reconstruction of the medium is greatly simplified:

Reconstruction of Low-Loss Dielectric Using Two Angles:

Step 1: Measure impulse reflection responses for normal {$r(n)$} and oblique {$r'(n)$} probing.

Step 2: Structure each response into a symmetric Toeplitz system (9); solve this system using symmetric Levinson algorithm.

Note, Steps 1-5 are the same as in Section IV-C, but repeated use of the backward asymmetric Levinson algorithm to generate synthetic reflection responses is no longer required.
V. Reconstruction of Unknown Bottom Layer

The methods presented so far are unable to reconstruct the dielectric constant of an unknown bottom half-space (the tth layer). This situation is often the case, e.g., RES probing of an ice shelf. Since the tth layer is semi-infinite, no returns are available from which to calculate the normal and oblique travel times {$T_t, T'_t$}. Without this information, we cannot calculate the angle of propagation {$\theta_t$}, the permittivity {$\epsilon_t$}, and the reflection coefficient {$R_{t-1,t}$}. Without the reflection coefficient, the transfer coefficients {$r_{t-1,t}, s_{t-1,t}$}, from which the attenuation coefficient of the (t-1)th layer {$\alpha_{t-1}$} is obtained, cannot be calculated.

We now show that by using two-angle probing, we have sufficient information for complete reconstruction of the layered medium, including the bottom half-space. The desired quantities {$\epsilon_t$ and $\alpha_{t-1}$} may be found from the normal and oblique reflection responses {$r(n), r'(n)$} as follows. From the procedure given in Section IV, the following are known:

• permittivity of the (t-1)th layer

• angle of propagation in the (t-1)th layer

• thickness of the (t-1)th layer

• loss factors to the (t-2, t-1) interface {$L_{t-2}, L'_{t-2}$}

• transfer and transmission coefficients {$r_{i,i+1}, s_{i,i+1}, \tau_{i+1}, \forall i \in [0, t-2]$}.

Since we have no information regarding the tth layer, we begin by setting the last reflection coefficient to zero, i.e., $R_{t-1,t} = 0$. Then, using the known set of transfer coefficients {$r_{i,i+1}, s_{i,i+1}, \forall i \in [0, t-2]$}, the synthesized normal response {$\hat{r}(n)$} will equal the measured response {$r(n)$} up to time increment $n_{t-1}$. At $n_{t-1}$, the measured response contains reflection data from the last interface (t-1, t), while the synthesized responses do not.

Using the Levinson algorithm, the transfer coefficient for the next layer {$r_{t-1,t}$} can be found from the measured {$r(n_{t-1})$} and estimated {$\hat{r}(n_{t-1})$} normal responses as

$$r_{t-1,t} = \frac{1}{\tau_{n_{t-1}}}\left(r(n_{t-1}) - \hat{r}(n_{t-1})\right). \qquad (18)$$

Use the algorithm of Section IV to determine the oblique transfer coefficients {$r'_{i,i+1}, s'_{i,i+1}, \forall i \in [0, t-2]$}. Using these coefficients and appropriate time delays {$T'_i, \forall i \in [0, t-1]$} we generate an estimate {$\hat{r}'(m)$} of the oblique response.

As in (18), the oblique transfer coefficient for the last layer {$r'_{t-1,t}$} is found from the measured {$r'(m_{t-1})$} and estimated {$\hat{r}'(m_{t-1})$} responses.

Note that the time indices $n_{t-1}$ for the normal probing responses {$r(n_{t-1}), \hat{r}(n_{t-1})$}, and $m_{t-1}$ for the oblique probing responses {$r'(m_{t-1}), \hat{r}'(m_{t-1})$}, are not the same. However, they are indicative of the same interface, i.e., the boundary between the (t-1)th and tth layer. Hence

$$r'_{t-1,t} = \frac{1}{\tau'_{m_{t-1}}}\left(r'(m_{t-1}) - \hat{r}'(m_{t-1})\right). \qquad (19)$$

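Recall from (3) and (15) that the transfer coefficients for normal {$r_{t-1,t}$} and oblique {$r'_{t-1,t}$} probing are given by

$$r_{t-1,t} = R_{t-1,t}L_{t-1} = \frac{\sqrt{\epsilon_{t-1}} - \sqrt{\epsilon_t}}{\sqrt{\epsilon_{t-1}} + \sqrt{\epsilon_t}}\; e^{-2\alpha_{t-1}\Delta_{t-1}}\, L_{t-2} \qquad (20a)$$

$$r'_{t-1,t} = R'_{t-1,t}L'_{t-1} = \frac{\sqrt{\epsilon_{t-1}}\cos\theta_{t-1} - \sqrt{\epsilon_t}\cos\theta_t}{\sqrt{\epsilon_{t-1}}\cos\theta_{t-1} + \sqrt{\epsilon_t}\cos\theta_t}\; e^{-2\alpha_{t-1}\Delta_{t-1}\cos\theta_{t-1}}\, L'_{t-2}. \qquad (20b)$$

Fig. 4. Ice shelf model. (Labels: air, firn, ice, sea ice, seawater, bedrock; ice shelf, glacier.)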


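We now have two equations with three unknowns {$\epsilon_t$, $\theta_t$, and $\alpha_{t-1}$}. From Snell's law, the angle of travel in the bottom layer can be found from its permittivity, i.e., $\theta_t = \sin^{-1}\!\left(\sqrt{\epsilon_{t-1}/\epsilon_t}\,\sin\theta_{t-1}\right)$. Using this in (20b) results in two equations in two unknowns {$\alpha_{t-1}, \epsilon_t$}. Solving for the attenuation coefficient as a function of permittivity {$\epsilon_t$} gives the following parametric equations:

$$\alpha_{t-1} = -\frac{1}{2\Delta_{t-1}} \ln\!\left[\frac{r_{t-1,t}}{L_{t-2}}\cdot\frac{\sqrt{\epsilon_{t-1}} + \sqrt{\epsilon_t}}{\sqrt{\epsilon_{t-1}} - \sqrt{\epsilon_t}}\right] \qquad (21a)$$

$$\alpha'_{t-1} = -\frac{1}{2\Delta_{t-1}\cos\theta_{t-1}} \ln\!\left[\frac{r'_{t-1,t}}{L'_{t-2}}\cdot\frac{\sqrt{\epsilon_{t-1}}\cos\theta_{t-1} + \sqrt{\epsilon_t}\cos\!\left(\sin^{-1}\!\left(\sqrt{\epsilon_{t-1}/\epsilon_t}\,\sin\theta_{t-1}\right)\right)}{\sqrt{\epsilon_{t-1}}\cos\theta_{t-1} - \sqrt{\epsilon_t}\cos\!\left(\sin^{-1}\!\left(\sqrt{\epsilon_{t-1}/\epsilon_t}\,\sin\theta_{t-1}\right)\right)}\right] \qquad (21b)$$

The solution to (21) is that value of $\epsilon_t$ for which $\alpha_{t-1} = \alpha'_{t-1}$.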
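One way to find the value of $\epsilon_t$ at which the two attenuation estimates agree is a one-dimensional root search. The sketch below is an illustration, not the authors' procedure: it assumes the forms of (21a) and (21b) in which each estimate is $-\ln[r/(L R)]$ divided by twice the (projected) thickness, assumes the bottom layer is denser than the layer above it ($\epsilon_t > \epsilon_{t-1}$), and uses bisection. All function names are hypothetical.

```python
import math

def alpha_from_normal(eps_t, r, L_prev, eps_prev, delta):
    """Attenuation implied by the normal-incidence transfer coefficient, as in (21a)."""
    a, b = math.sqrt(eps_prev), math.sqrt(eps_t)
    R = (a - b) / (a + b)
    return -math.log(r / (L_prev * R)) / (2.0 * delta)

def alpha_from_oblique(eps_t, r_obl, L_prev_obl, eps_prev, delta, theta_prev):
    """Same quantity from the oblique coefficient, as in (21b); theta_t from Snell's law."""
    theta_t = math.asin(math.sqrt(eps_prev / eps_t) * math.sin(theta_prev))
    a = math.sqrt(eps_prev) * math.cos(theta_prev)
    b = math.sqrt(eps_t) * math.cos(theta_t)
    R = (a - b) / (a + b)
    return -math.log(r_obl / (L_prev_obl * R)) / (2.0 * delta * math.cos(theta_prev))

def solve_bottom_layer(r, r_obl, L_prev, L_prev_obl, eps_prev, delta, theta_prev,
                       eps_max=500.0, iterations=200):
    """Bisection on the mismatch alpha' - alpha over eps_t in (eps_prev, eps_max]."""
    def mismatch(eps_t):
        return (alpha_from_oblique(eps_t, r_obl, L_prev_obl, eps_prev, delta, theta_prev)
                - alpha_from_normal(eps_t, r, L_prev, eps_prev, delta))
    lo, hi = 1.01 * eps_prev, eps_max
    f_lo = mismatch(lo)
    if f_lo * mismatch(hi) > 0.0:
        raise ValueError("no sign change of the mismatch in the search interval")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = mismatch(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    eps_t = 0.5 * (lo + hi)
    return eps_t, alpha_from_normal(eps_t, r, L_prev, eps_prev, delta)
```

The mismatch varies very slowly with $\epsilon_t$ when $\epsilon_t$ is large, so with quantized or noisy data the root is poorly determined; only noise-free inputs can be expected to return the exact permittivity.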


VI.  Numerical  Example 


A.  Ice  Shelf  Model 

To demonstrate the above algorithms, we simulate RES probing of an ice shelf, which is a floating glacial mass attached to land (Fig. 4). In addition to a shallow layer of snow/firn on top, there is a layer of salt ice at the interface between the glacial ice and the sea water. Ice shelves represent a stratified system that is of particular interest in glaciology, especially radioglaciology, and in physical oceanography [8]. Since layers of firn and sea ice will give poor results for ice thickness if the multiple reflections are neglected, a multi-layer model must be used (see Fig. 1).
TABLE II

Ice Shelf Parameters (Normal Probing)

Layer      Thickness (m)  Temp (°C)  Density ρ (Mg/m^3)  Salinity  ε'     2-way delay (nsec)  ε''      α       e^{-2αΔ}
Air        17.3           n/a        n/a                 n/a       1      115.5               0        0       1
Firn       22.1           -25        .55                 n/a       2.09   213.0               1.26e-4  9.1e-5  .9960
Ice        91.3           -15        n/a                 n/a       2.95   1045.0              .0011    6.7e-4  .8848
Sea Ice    16.8           -10        n/a                 5         3.44   207.5               .0382    2.2e-2  .4775
Sea Water  semi-∞         1          n/a                 10        84.41  ∞                   157      n/a     0

TABLE III

Ice Shelf Parameters (Oblique Probing)

Layer      Angle (θ)  Distance (m)  T' (nsec)  e^{-2αΔ cos θ}
Air        30.00°     15.0          100.0      1.000
Firn       20.23°     20.8          200.0      .9962
Ice        16.92°     87.3          1000.0     .8896
Sea Ice    15.63°     16.2          200.0      .4909
Sea Water  3.11°      semi-∞        ∞          0

Furthermore, the layer of salt ice (or sea ice) is highly absorbing, compared to the firn and ice layers. We show that if the system is modeled as a lossless medium, the bottom is mistaken for bedrock {$\epsilon' \approx 10.5$} instead of sea water {$\epsilon' = 84.16$}. On the other hand, if a low-loss model is assumed {$\epsilon' \gg \epsilon''$}, and we probe at two angles, the ice shelf is correctly reconstructed.

As shown in Fig. 4, the ice shelf is modeled as a multilayer system. The thicknesses of the layers are realistic [7]-[9], [11], [12], [16]-[18], but they have been chosen to provide travel-times that are integer multiples of 0.5 nsec (this could be accomplished by integrating the continuous-time radar return and sampling every 0.5 nsec). We consider a video pulse system operating at 100 MHz. The transmitted pulses are 0.5 nsec in duration, and returns are clocked at 2 GHz. The dielectric constant {$\epsilon = \epsilon' - i\epsilon''$} for each layer was determined using a combination of actual measurements [8], [9] and theoretical values calculated using variations of the Debye equation [19]. Layer parameters for normal and oblique (30°) probing are given in Tables II and III, respectively.
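The attenuation constants and delays of Table II can be cross-checked with the standard low-loss approximation $\alpha \approx (\pi f / c)\,\epsilon'' / \sqrt{\epsilon'}$ and the two-way delay $2\Delta\sqrt{\epsilon'}/c$. The sketch below assumes the 100 MHz operating frequency quoted above; the function names are illustrative.

```python
import math

C = 299_792_458.0  # free-space speed of light, m/s

def low_loss_alpha(eps_real, eps_imag, freq_hz):
    """Attenuation constant (1/m) of a low-loss dielectric, valid for eps' >> eps''."""
    return math.pi * freq_hz / C * eps_imag / math.sqrt(eps_real)

def two_way_delay_ns(thickness_m, eps_real):
    """Two-way normal-incidence travel time through a layer, in nsec."""
    return 2.0 * thickness_m * math.sqrt(eps_real) / C * 1e9

f = 100e6
alpha_firn = low_loss_alpha(2.09, 1.26e-4, f)
alpha_ice = low_loss_alpha(2.95, 0.0011, f)
alpha_sea_ice = low_loss_alpha(3.44, 0.0382, f)
delay_firn = two_way_delay_ns(22.1, 2.09)
print(alpha_firn, alpha_ice, alpha_sea_ice, delay_firn)
```

The results agree with the α column and the firn delay of Table II to within the rounding of the tabulated values.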

The  reflection  and  transfer  coefficients  for  each  interface 
are  shown  in  Table  IV. 
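For normal probing, the entries of Table IV follow from the permittivities and per-layer losses of Table II through $r_{n,n+1} = R_{n,n+1}L_n$ and $s_{n,n+1} = R_{n,n+1}/L_n$, with $L_n$ the cumulative product of the layer losses. A minimal sketch (the function name is an assumption):

```python
import math

def transfer_coefficients(eps, layer_loss):
    """eps[i]: permittivity of layer i (last entry is the bottom half-space).
    layer_loss[i]: two-way loss exp(-2*alpha_i*Delta_i) of finite layer i.
    Returns a list of (R, L, r, s) for each interface (i, i+1)."""
    rows, L = [], 1.0
    for i in range(len(eps) - 1):
        L *= layer_loss[i]  # cumulative loss factor L_i
        a, b = math.sqrt(eps[i]), math.sqrt(eps[i + 1])
        R = (a - b) / (a + b)  # lossless normal-incidence reflection coefficient
        rows.append((R, L, R * L, R / L))
    return rows

rows = transfer_coefficients([1.0, 2.09, 2.95, 3.44, 84.41],
                             [1.0, 0.9960, 0.8848, 0.4775])
for R, L, r, s in rows:
    print(f"{R:.4f} {L:.4f} {r:.4f} {s:.4f}")
```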

B.  Data:  Plane-Wave  Impulse  Reflection  Responses 

Using the transfer coefficients given in Table IV, and the backward asymmetric Levinson algorithm, we simulated the free-surface reflection responses for normal {$r(n)$} and oblique {$r'(n)$} probing. Results are shown in Fig. 5(a) and (b), respectively.

As expected, the oblique response of an impulsive plane wave of infinite extent shows less travel delay due to the path length being shorter by a factor of cos θ. Also note the multiple reflections due to multiple scattering between the interfaces. Elimination of the multiple reflections using the wave system (8) that accounts for them is clearly necessary. Note that methods based on the Born (single-scattering) approximation will incorrectly identify these as (fictitious) interfaces. The effects of these errors will be exacerbated due to the losses in the system. A multiple reflection incorrectly identified as a primary reflection in a Born-approximation-based inversion method will have an incorrect loss factor assigned to the layer, and this introduces more error in reconstructing deeper layers.

Fig. 5. Reflection response versus.


C. Reconstruction of Reflection Coefficients

Solving the symmetric Toeplitz system constructed using the normal response {$r(n)$} gives the reflection coefficients {$\hat{r}_{n,n+1}$}. Similarly, using the oblique response {$r'(n)$} gives the oblique reflection coefficients {$\hat{r}'_{n,n+1}$}. These reflection coefficients are given in Table V.

D. Reconstruction Using Lossless Model

The medium reconstructed under the incorrect assumption that the medium is lossless, using the lossless algorithm in Section III-G and the data in Fig. 5(a), is given below in Table VI. We initialize the algorithm using the a priori knowledge of the 0th layer. Note that ε' is off by approximately 2% in layer 3 and completely wrong in layer 4. This leads to misidentification of layer 3 as fractured ice instead of sea ice, and layer 4 as bedrock instead of sea water!

E.  Reconstruction  Using  Lossy  Model 

We  now  use  oblique  plane-wave  probing,  performed  at 
an  angle  of  30°  relative  to  normal.  Recall  that  this  can  be 


TABLE IV

Model Transfer Coefficients

Interface (n, n+1)       R_{n,n+1}  L_n    r_{n,n+1}  s_{n,n+1}  R'_{n,n+1}  L'_n   r'_{n,n+1}  s'_{n,n+1}
Air/Firn (0,1)           -.1822     1.000  -.1822     -.1822     -.2207      1.000  -.2207      -.2207
Firn/Ice (1,2)           -.0859     .9960  -.0856     ...        -.0956      .9962  -.0952      -.0960
Ice/Sea Ice (2,3)        -.0384     .8813  -.0338     -.0436     -.0417      .8862  -.0370      -.0471
Sea Ice/Sea Water (3,4)  ...        .4208  -.2795     -1.5782    -.6741      .4350  -.2932      ...

TABLE V

Reconstructed Reflection Coefficients

Interface (n, n+1)  $\hat{r}_{n,n+1}$  $\hat{r}'_{n,n+1}$
Air/Layer 1         -.1822             -.2207
Layer 1/Layer 2     -.0856             -.0952
Layer 2/Layer 3     -.0338             -.0370
Layer 3/Layer 4     -.2794             -.2931

TABLE VI

Medium Reconstruction Assuming Lossless Model

Layer  ε'     T (nsec)  Δ (m)  Type
0      1.00   115.5     17.3   air
1      2.09   213.0     22.1   firn
2      2.95   ...       91.3   glacial ice
3      3.38   207.5     16.9   fractured ice
4      10.65  ∞         ∞      bedrock w/ water!

accomplished using impulsive point source reflection data by beamforming for both normal and oblique incidence.

The angle of travel in each layer can be found from the ratio of normal to oblique travel intervals. The permittivity of each layer is then found using Snell's law, and the layer thickness can then be found knowing the travel time, angle, and permittivity. Results are given in Table VII-A. Note that
the  permittivities  are  not  exact  since  we  have  limited  the 
time  interval  accuracy  to  0.5  nsec;  choosing  a  smaller  time 
interval  would  improve  this.  We  have  deliberately  chosen  a 
relatively  large  time  interval  to  show  that  our  method  still 
works  reasonably  well. 

From Tables IV and V, we note that the recovered reflection coefficients for both normal and oblique probing are approximately equal to the real transfer coefficients, i.e., $\hat{r}_{n,n+1} \approx r_{n,n+1}$ and $\hat{r}'_{n,n+1} \approx r'_{n,n+1}$. Thus we may use the low-loss assumption to reconstruct the layers (in actual application this would have to be known a priori; otherwise the full algorithm of Section IV-C must be used). From the calculated permittivities of Table VII-A and the measured coefficients {$\hat{r}_{n,n+1}, \hat{r}'_{n,n+1}$} of Table V we can calculate the loss factor {$L_n$} for all but the next to last layer using the low-loss algorithm of Section IV-D. Results are in Table VII-B.


TABLE VII-A

Layer Parameters from Two-Angle Probing

Layer (n)  T_n (nsec)  T'_n (nsec)  θ_n (°)  ε'_n  Δ_n (m)
0          115.5       100.0        30.0     1.00  17.3
1          213.0       200.0        20.1     2.10  ...
2          ...         ...          16.9     2.95  91.1
3          207.5       ...          15.5     3.50  16.6
4          ∞           ∞            0        n/a   ∞

TABLE VII-B

Calculated Layer Losses

Interface (n, n+1)  $\hat{r}_{n,n+1}$  R_{n,n+1}  L_n     $\hat{r}'_{n,n+1}$  L'_n
Air/Layer 1         -0.1822            -0.1822    1.000   ...                 1.000
Layer 1/Layer 2     -0.0856            -0.0859    0.9965  -0.0952             0.9967
Layer 2/Layer 3     -0.0338            -0.0427    0.7916  -0.0370             0.7997
Layer 3/Layer 4     -0.2794            -0.6540    0.4272  -0.2931             0.4414


TABLE VIII

Reconstruction with Two-Angle Data

Layer  ε'    Δ (m)  α       Type
0      1.00  17.3   0       air
1      2.10  22.0   8.0e-5  firn
2      2.95  91.1   1.3e-3  ice
3      3.50  16.6   1.9e-2  sea ice
4      >80   ∞      n/a     seawater

Using the boldface values in Tables VII-A and VII-B and (21a) and (21b), the remaining unknowns {$\alpha_3$ and $\epsilon_4$} can be found. Since reconstruction accuracy is limited by the 0.5 nsec sampling interval, minimizing the objective function $|\alpha_3 - \alpha'_3|$ does not produce a sharp null at any single $\epsilon_4$. We do however find that the solution must lie in the region $\epsilon_4 > 80$. Therefore, we conclude the bottom half-space is sea water. Using $\epsilon_4 \approx 80$, we calculate the values for $\alpha_3$ and $R_{3,4}$ shown in italics in Table VII-B.

The final reconstructed values of the ice shelf are given in Table VIII and should be compared to the actual values in Table II. Here again, the knowledge of the 0th layer was a priori. We note that the final layer is now correctly identified as sea water, as opposed to bedrock in the lossless reconstruction. The permittivities ε' for the first three layers are reconstructed exactly; the error for the fourth layer is approximately 2%.
TABLE IX

Reconstructed Reflection Coefficients for Low Noise Data

time (nsec)  130      250     340      1390     1600  1720    1840     1930
actual       -0.1822  0       -0.0856  -0.0338  ...   0       0        0
Levinson     -0.1834  0       -0.0881  -0.0303  ...   0       0        0
Born         -0.1873  0.0360  -0.0845  -0.0282  ...   0.1012  -0.0260  0.0327

Fig. 6. Noisy versions of normal probing returns vs. nsec.

Fig. 7. Reconstructed transfer coefficients using unmodified reflection


Note that the algorithm computes the cumulative losses {$L_n$} from which the attenuation coefficients {$\alpha_n$} are subsequently computed. Round-off error can lead to an incorrect $\alpha_n$, but the errors will not accumulate; note that $\alpha_3$ is computed more accurately than $\alpha_2$. The errors in permittivity and attenuation constants are due entirely to limiting the sample rate to 2 GHz; a finer discretization will produce better results. This example thus shows that proper modeling significantly improves the reconstruction results.
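The step from cumulative losses to attenuation coefficients can be sketched as follows, assuming the relation $L_n = L_{n-1}\,e^{-2\alpha_n\Delta_n}$ with $L_{-1} = 1$ that is implied by (20a). The function name is an assumption; the inputs are the $L_n$ column of Table VII-B and the thicknesses of Table VIII.

```python
import math

def attenuation_from_cumulative_loss(L, thickness_m):
    """L[n]: cumulative loss factor up to interface (n, n+1).
    thickness_m[n]: thickness of layer n.
    Inverts L_n = L_{n-1} * exp(-2 * alpha_n * Delta_n), with L_{-1} = 1."""
    alphas, L_prev = [], 1.0
    for L_n, d in zip(L, thickness_m):
        # Only the ratio of consecutive loss factors enters each alpha_n.
        alphas.append(-math.log(L_n / L_prev) / (2.0 * d))
        L_prev = L_n
    return alphas

alphas = attenuation_from_cumulative_loss([1.000, 0.9965, 0.7916, 0.4272],
                                          [17.3, 22.0, 91.1, 16.6])
print(["%.2e" % abs(a) for a in alphas])
```

Because each $\alpha_n$ depends only on the ratio $L_n / L_{n-1}$, an error in one loss factor affects the two adjacent layers but does not propagate further, in line with the remark above that errors do not accumulate.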

F.  Reconstruction  With  Noisy  Data 

The layer stripping algorithms used in this reconstruction have the reputation of being unstable in noise. However, these algorithms fail only when given infeasible data, i.e., data that could not have come from an actual medium in a noise free environment. The condition for feasible lossless data is that the symmetric Toeplitz matrix in (9) be positive definite. For noisy data, the Toeplitz matrix may not meet this criterion, and this will cause the algorithms to diverge. In these situations, the Toeplitz matrix can be forced to be positive definite by setting all negative eigenvalues to some small positive number.
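The link between positive definiteness and divergence can be illustrated with the generic Levinson-Durbin recursion for a symmetric Toeplitz matrix. This sketch is not the paper's system (9), whose exact form is not reproduced here, and the sign convention of the coefficients is an assumption. It relies on the standard fact that a symmetric Toeplitz matrix with positive diagonal is positive definite exactly when every reflection coefficient produced by the recursion has magnitude less than one.

```python
def levinson_reflection_coefficients(t):
    """Levinson-Durbin recursion on the symmetric Toeplitz matrix whose first
    column is t[0..N]. Returns (ks, feasible): ks are the reflection
    coefficients found so far; feasible is True when every |k| < 1, i.e. the
    matrix is positive definite (t[0] > 0 is required)."""
    if t[0] <= 0:
        return [], False
    a = [1.0]          # prediction polynomial, a[0] = 1
    err = float(t[0])  # prediction error; positive while the matrix is PD
    ks = []
    for m in range(1, len(t)):
        acc = sum(a[j] * t[m - j] for j in range(m))
        k = -acc / err
        ks.append(k)
        if abs(k) >= 1.0:
            return ks, False  # infeasible data: stop before the recursion blows up
        a = [1.0] + [a[j] + k * a[m - j] for j in range(1, m)] + [k]
        err *= 1.0 - k * k
    return ks, True

ks_ok, ok = levinson_reflection_coefficients([1.0, 0.5, 0.25, 0.125])
ks_bad, bad = levinson_reflection_coefficients([1.0, 0.9, 0.2])
print(ks_ok, ok, ks_bad, bad)
```

In the second call the recursion returns a coefficient larger than one in magnitude, which is the kind of impossible value that signals infeasible data.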

To illustrate this point we added zero-mean white Gaussian noise to the normal probing response discussed in the above example. Since this model was shown to be low-loss (i.e., $r_{i,i+1} \approx \hat{r}_{i,i+1}$), we argue that the response {$r(n)$} must come close to meeting the lossless feasibility condition; therefore, we ensure it does by altering its eigenvalues as discussed (this is the projection of the Toeplitz matrix onto the space of positive definite matrices). We added both low (standard deviation = 0.003) and high (standard deviation = 0.03) levels of Gaussian noise to the normal response {$r(n)$}; the results are shown in Fig. 6(a) and (b), respectively.

Without implementing the data feasibility constraint, we applied the symmetric Levinson algorithm to the noisy reflection data {$r(n)$}. This gave the transfer coefficients {$\hat{r}$} shown in Fig. 7.

The reconstruction using the high noise data is shown in Fig. 7(b). Note the impossibly high coefficient value of approximately -35; this marks divergence of the algorithm. In fact, the noise level was the minimum needed to cause the routine to fail. In contrast, the low noise reconstruction shown in Fig. 7(a) is merely a noisy version of the true reflection coefficient profile.

Fig. 8. Thresholded reflection data for low noise case: (a) Levinson reconstruction, (b) Born reconstruction.

Fig. 9. Reflection coefficient reconstruction of high noise data: (a) Levinson reconstruction, (b) Born reconstruction.

As proposed earlier, we were able to improve this result by thresholding. Any estimated coefficient less than 0.025 in magnitude was set to zero. We also implemented a Born approximation (i.e., neglect multiple reflections) on the low noise data and applied the same threshold. The results are shown in Fig. 8(a) and (b).

We note that the Levinson reconstruction is superior to the Born approximation in that no false interfaces were obtained and that the reconstructed reflection coefficients are closer to the true values (Table IX).
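The thresholding step and the count of false interfaces can be sketched as below. The function names are illustrative; the coefficient lists are the values of Table IX in the order in which they are listed there.

```python
def threshold(coeffs, level=0.025):
    """Set every coefficient smaller than `level` in magnitude to zero."""
    return [c if abs(c) >= level else 0.0 for c in coeffs]

def false_interfaces(estimate, actual):
    """Count positions where a nonzero coefficient is reported but none exists."""
    return sum(1 for e, a in zip(estimate, actual) if a == 0 and e != 0)

actual = [-0.1822, 0, -0.0856, -0.0338, 0, 0, 0]
levinson = threshold([-0.1834, 0, -0.0881, -0.0303, 0, 0, 0])
born = threshold([-0.1873, 0.0360, -0.0845, -0.0282, 0.1012, -0.0260, 0.0327])
print(false_interfaces(levinson, actual), false_interfaces(born, actual))  # prints 0 4
```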

We now investigate reconstruction with high noise data. Since the original reconstruction diverged, we applied the positive definite constraint to the symmetric Toeplitz response matrix. The symmetric Levinson algorithm was then run on the modified Toeplitz response matrix, with its result plotted in Fig. 9(a). Again we compare this reconstruction with that obtained using the Born approximation [Fig. 9(b)].


IK AND YAGLE: RECONSTRUCTION OF MULTILAYERED LOSSY DIELECTRICS FROM PLANE WAVE IMPULSE RESPONSES


In the presence of this much noise it is hard to argue that one routine is superior to another (although it does appear that the Born approximation gives better estimates of the coefficients).

The main conclusion of all this is that the stability of the Levinson reconstruction is data dependent and when the symmetric Levinson algorithm is used for the inverse problem, the data can be constrained such that stability is ensured.
Another  possibility  is  to  use  plane-wave  reflection  data  at 
more  than  two  angles  of  incidence.  Although  the  additional 
angles  provide  no  extra  information  for  noiseless  data,  they 
can  be  used  to  compute  least-squares  estimates  of  the  layer 
parameters.  This  was  done  in  [4]  for  the  acoustic  medium, 
and  significant  improvement  was  observed. 

VII.  Conclusion 

We have presented and discussed some new algorithms for reconstructing a lossy layered dielectric from its reflection responses to plane waves at two different angles of incidence. Losses are modeled by an asymmetric wave system associated with asymmetric Toeplitz matrices and the asymmetric Levinson algorithm. This permits rapid solution of the forward and inverse scattering problems. A novel approach that iterates between these two problems, recursively reconstructing another layer of the medium at each iteration, was also presented; this approach is needed since absorption is a property of the layers, not the interfaces, so probing at additional angles will not help. Numerical simulations of an ice shelf demonstrated that the algorithms work, and provide significant improvement over models that neglect losses (lossless model) and multiple reflections (the Born approximation). Performance of the reconstruction algorithms using noisy response data was discussed.

A companion paper [5] has discussed the asymmetric wave system (8) in greater detail, along with algorithms for forward and inverse scattering that involve transmission as well as reflection responses. In the present paper we have focused on reconstruction of lossy media from reflection responses only, since transmission responses are generally unavailable in remote sensing applications. Further applications in remote sensing, noted in Section II, and nondestructive testing (for which transmission data can be used), constitute possible topics for further research.
APPENDIX FI

T.-S.  Pan  and  A.E.  Yagle,  “Acceleration  and  Filtering  in  the  Generalized 
Landweber  Iteration  using  a  Variable  Shaping  Matrix,”  IEEE  Trans.  Medical 
Imaging  12(2),  278-286,  June  1993. 

This paper discusses the generalized Landweber iteration for solving large linear systems
of equations. We show how the convergence and filtering behavior of this algorithm
can  be  tightly  controlled,  in  contrast  to  most  iterative  algorithms  whose  behavior  cannot 
be  controlled  and  that  simply  go  where  they  may.  This  paper  is  a  good  introduction  to 
the  algorithm  and  how  to  design  it  to  obtain  desired  behavior. 

Although  the  specific  application  investigated  here  is  positron  emission  tomography, 
the results could also be applied to discretized integral equations, such as the generalized
Gel’fand-Levitan or Marchenko integral equations. Although the matrix kernels
are  no  longer  sparse,  a  projection  or  backprojection  (multiplication  by  the  kernel  or  its 
transpose)  can  be  implemented  quickly  using  FFT-based  convolution  methods;  number- 
theoretic  transforms  would  require  even  fewer  multiplications. 

Acceleration and Filtering in the Generalized
Landweber Iteration Using a Variable Shaping Matrix

and W. Leslie Rogers, Member, IEEE


Abstract — We  use  the  generalized  Landweber  iteration  with 
a  variable  shaping  matrix  to  solve  the  large  linear  system  of 
equations  arising  in  the  image  reconstruction  problem  of  emission 
tomography. Our method is based on the property that once a
spatial frequency image component is almost recovered within
$\epsilon$ in the generalized Landweber iteration, this component will
still stay within $\epsilon$ during subsequent iterations with a different
shaping matrix, as long as this shaping matrix satisfies the
convergence criterion for the component. Two different shaping
matrices are used: the first recovers low-frequency image
components;  and  the  second  may  be  used  either  to  accelerate 
the  reconstruction  of  high-frequency  image  components,  or  to 
attenuate  these  components  to  filter  the  image.  The  variable 
shaping  matrix  gives  results  similar  to  truncated  inverse  filtering, 
but  requires  much  less  computation  and  memory,  since  it  does 
not  rely  on  the  singular  value  decomposition. 

I.  Introduction 

IN  emission  tomography,  the  image  reconstruction  problem 
can  be  formulated  as  the  solution  to  a  large  linear  system 
of equations [1]-[3]. This system of equations tends to be
large,  with  dimensions  on  the  order  of  thousands,  and  ill- 
conditioned.  For  this  reason,  regularization  and  minimization 
of  computation  time  in  solving  it  become  important  issues. 

One solution [4]-[7] is to perform a singular value decomposition
(SVD) of the system matrix which describes the
transformation or projection process of the imaging system.
Since the system matrix is usually ill-conditioned, small singular
values can be set to zero to obtain a stable (not sensitive
to  the  measurement  noise)  solution.  This  approach  is  called 
truncated  inverse  filtering  (TIF)  [8].  Another  approach  is  to 
window  the  small  singular  values  to  zero  through  a  decreasing 
series  of  weight  factors  while  keeping  the  large  singular  values 
unchanged.  This  research  follows  the  latter  approach. 

Due to the attenuation, the imaging system in emission
tomography is spatially variant. The point spread function of
the imaging system varies with the location of the point source.
Fourier spatial frequencies are unsuitable for describing orthogonal
components of the image. Instead, this paper defines
“spatial  frequencies”  as  the  reciprocal  of  singular  values,  and 
the  associated  image  components  as  the  projection  of  the 
image  on  the  corresponding  singular  vectors.  This  terminology 
follows  [6],  [7].  Therefore,  high  spatial  frequency  components 
of  the  object  are  defined  as  components  on  singular  vectors 
associated with small singular values; and low spatial frequency
components as components on singular vectors with
large  singular  values  [9],  [10].  Using  this  terminology,  high- 
frequency  components  of  the  reconstructed  image,  which  are 
sensitive  to  noise,  will  be  removed  in  TIF. 

Computing the SVD of a large system matrix takes a
long time and much memory. Furthermore, in emission
tomography, attenuation results in different objects having
different system matrices, each of which would require an
SVD. Thus, using TIF to derive a solution for the image
reconstruction in emission tomography requires a huge amount
of computation and memory.

This paper presents a novel alternative approach: using
the generalized Landweber iteration [11] with a variable
shaping matrix (defined below). This method can either 1)
accelerate the reconstruction of high-frequency components;
or 2) roll off the inverse filter to prevent the Gibbs phenomenon
arising from the sharp truncation in TIF, and preserve the stability
of the solution when the system matrix is ill-conditioned.
The motivation for using a variable shaping matrix is based
on the recover-and-stay property of the generalized Landweber
iteration, which is stated and proved for the first time in this
paper. This property states that in the generalized Landweber
iteration, once an image component is recovered within $\epsilon$, the
component will still stay within $\epsilon$ in subsequent iterations even
with a different shaping matrix, as long as this shaping matrix
satisfies the convergence criterion for the component. Our new
approach  uses  two  different  shaping  matrices:  the  first  one 
is  for  fast  recovery  of  low-frequency  components  and  partial 
recovery  of  high-frequency  components;  the  second  one  may 
be  used  either  for  speeding  up  the  reconstruction  of  high- 
frequency  components  (if  noise  is  not  a  problem),  or  for 
suppressing  high-frequency  components  while  maintaining  the 
recovered  low-frequency  components. 

There are three advantages in using this new approach [12]:
1) the number of iterations, as well as the characteristics of
the filtering, can be predetermined before the iteration starts;
2) the computation time and storage space are greatly reduced
as compared to TIF; and 3) once a filter is designed from two
shaping matrices, a different filter can be obtained by simply
changing the gain factor in the generalized Landweber iteration,
or by using the dc-suppression procedure [13]. Therefore,
by changing the gain factor, a spectrum of filters can be derived
and investigated off-line to find out which provides the most
suitable solution to the image reconstruction problem.

This  paper  is  organized  as  follows.  Section  II  formulates 
the  image  reconstruction  problem  in  emission  tomography  as 
a  problem  of  solving  a  linear  system  of  equations,  reviews  TIF, 
and  also  reviews  two  alternative  iterative  methods  for  solving  a 
system  of  linear  equations:  algebraic  reconstruction  technique 
(ART)  and  conjugate  gradient  (CG)  method.  Some  drawbacks 
of  these  methods  will  be  discussed.  Section  III  summarizes  the 
generalized Landweber iteration and its convergence control
and recover-and-stay property. A formal proof of the recover-and-stay
property and the design of a shaping matrix
are also presented. This section complements
original  work  on  the  generalized  Landweber  iteration  by 
Strand  [11]  in  1974;  the  reader  is  also  encouraged  to  read 
[11].  Section  IV  presents  the  new  variable  shaping  matrix 
method.  Two  examples  of  using  the  method  to  accelerate  the 
reconstruction  of  high-frequency  components,  and  to  attenuate 
the high-frequency components in the image to achieve regularization,
are presented separately. The flexibility of filtering
in  the  generalized  Landweber  iteration  by  changing  the  gain 
factor  or  using  the  dc-suppression  procedure  is  also  discussed. 
A  comparison  of  computational  requirements  between  the 
variable shaping matrix method and TIF is also made. Finally,
Section  V  concludes  the  paper  with  a  summary  and  discussion 
of  possible  applications  of  the  variable  shaping  matrix  to  other 
signal  processing  problems. 

II.  Problem  Formulation  and  Background 

A.  System  Equation 

The  image  reconstruction  problem  in  emission  tomography 
can  be  formulated  [1],  [2]  as  the  solution  of  a  linear  system 
of  equations  represented  as 

Ax  =  b  (1) 

where $A$ is an $m \times n$ system matrix which describes the system
geometry, $x$ is an $n \times 1$ vector of the image pixels, and $b$ is
an $m \times 1$ vector of the measured projections of the image.

The  problem  is  to  determine  x  given  A  and  b,  a  typical  task 
of  solving  a  linear  system  of  equations.  An  obvious  solution 
to  (1)  is  the  pseudoinverse  or  minimum-norm  least-squares 
solution 

$$x^* = (A^T A)^{\dagger} A^T b \tag{2}$$

where $A^T$ is the matrix transpose of $A$ and $(A^T A)^{\dagger}$ is the
pseudoinverse of $A^T A$ (which may not have full rank). The
system matrix $A$, in addition to being large [14], is normally
ill-conditioned, so that a small perturbation in $b$ can lead
to a large change in $x^*$. Therefore, $x^*$ is an unsatisfactory
solution if there is noise in the measured projection data $b$.
The ill-conditioning is due to the presence of high-frequency
components of $x^*$, on the singular vectors with small singular
values of $A$ [9], [10].

B.  Truncated  Inverse  Filtering 

Truncated inverse filtering (TIF) [8] is used to derive a
stable solution of (1) by removing high-frequency components
of $x^*$, which are very sensitive to the measurement noise
in $b$. A TIF solution is found as follows. First, compute
the SVD of $A$: $A = U \Sigma V^T$. Here $U = [u_1, u_2, \ldots, u_m]$
and $V = [v_1, v_2, \ldots, v_n]$ are orthogonal matrices, and $\Sigma$
is “diagonal” in that $(\Sigma)_{i,j} = 0$ unless $i = j$, in which
case $(\Sigma)_{i,i} = \sigma_i$, the singular values of $A$. Without loss of
generality, let $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_{\rho(A)} > 0$, where $\rho(A)$ is
the rank of $A$. Second, choose a threshold $\sigma_T$ and discard the
image components on singular vectors associated with singular
values $\sigma_i < \sigma_T$. The number of image components kept will
be $k \le \min(m, n)$. Finally, form the TIF solution using

$$x_{TIF} = \sum_{i=1}^{k} (1/\sigma_i)(b, u_i)\, v_i \tag{3}$$

where $(b, u_i) = b^T u_i$. This approach is straightforward if
the  size  of  A  is  small.  However,  in  image  reconstruction  in 
emission  tomography,  the  size  of  A  will  be  several  thousand 
by  several  thousand,  and  it  will  be  even  larger  in  3-dimensional 
image  reconstruction.  The  amount  of  computation  and  memory 
required  to  implement  (3)  can  become  enormous  (see  Section 
IV-D). 
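
The TIF recipe in (3) can be sketched in a few lines for a small dense matrix. This is only an illustration under stated assumptions: the function name `tif_solve`, the threshold argument `sigma_t`, and the 3 x 2 test matrix are hypothetical and do not come from the paper, and a dense SVD is exactly the step that becomes impractical at PET sizes.

```python
import numpy as np

def tif_solve(A, b, sigma_t):
    """Truncated inverse filtering, eq. (3): keep components with sigma_i >= sigma_t.

    sigma_t should be positive so that zero singular values are never inverted.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    keep = s >= sigma_t                       # components that survive truncation
    coeffs = (U[:, keep].T @ b) / s[keep]     # (1/sigma_i) (b, u_i)
    return Vt[keep].T @ coeffs                # sum of coeffs_i * v_i

# Hypothetical toy system (singular values are roughly 2.30 and 1.30).
A_tif = np.array([[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
x_true_tif = np.array([1.0, -1.0])
b_tif = A_tif @ x_true_tif

x_full = tif_solve(A_tif, b_tif, sigma_t=1e-12)   # nothing truncated
x_trunc = tif_solve(A_tif, b_tif, sigma_t=2.0)    # only the largest component kept
```

With nothing truncated the result coincides with the pseudoinverse solution (2); raising `sigma_t` removes the components that the paper calls high-frequency.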

C.  Alternative  Iterative  Approaches 

Suppose  the  scatter  can  be  corrected  sufficiently  on  the 
measured  projections  b.  Then  matrix  A  is  sparse  in  emission 
tomography  [14];  an  iterative  method  may  be  preferable  for 
solving  (1).  Among  iterative  approaches  [15]  to  solve  (1),  the 
algebraic  reconstruction  technique  (ART)  [1],  [3],  [16]  and  the 
conjugate  gradient  (CG)  method  [15],  [17]  have  received  the 
most  attention,  for  their  fast  speed  in  deriving  an  approximate 
solution to (1). ART is a technique of projection onto convex
sets (POCS). By describing each individual linear equation in
(1) as a hyperplane (also a convex set) and projecting a previous
estimate from one hyperplane onto another hyperplane,
a sequence of approximate solutions can be derived from the
iteration. On the other hand, the CG method utilizes some
orthogonality conditions [17]. The sequence of approximate
solutions from the CG iteration is guaranteed to converge to
$x^*$ in no more than $\rho(A)$ iterations if a zero initial condition
is used.
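
The hyperplane-projection view of ART described above can be illustrated with a minimal cyclic (Kaczmarz-style) sweep. This is a sketch, not the implementation used in [1], [3], or [16]: it has no relaxation parameter, and the name `art_sweeps` and the 2 x 2 system are assumptions made for the example.

```python
import numpy as np

def art_sweeps(A, b, n_sweeps):
    """Cyclically project the estimate onto each hyperplane a_i . x = b_i."""
    x = np.zeros(A.shape[1])
    for _ in range(n_sweeps):
        for a_i, b_i in zip(A, b):
            # Orthogonal projection of x onto the i-th hyperplane.
            x = x + (b_i - a_i @ x) / (a_i @ a_i) * a_i
    return x

# Hypothetical consistent 2 x 2 system.
A_art = np.array([[2.0, 1.0], [1.0, 3.0]])
x_true_art = np.array([1.0, -1.0])
b_art = A_art @ x_true_art
x_art = art_sweeps(A_art, b_art, n_sweeps=100)
```

Each inner step depends on the result of the previous one, which is the sequential (nonparallel) character of ART noted in the next paragraph.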

Since  the  sequence  of  approximate  solutions  from  ART  or 
CG  depends  on  the  projection  b,  neither  ART  nor  CG  can 
guarantee  that  a  satisfactory  result  will  be  obtained  after  a 
given  number  of  iterations.  Moreover,  the  computation  in 
ART is nonparallel [13], [1]. In actual application, both the
ART  and  CG  iterations  must  be  stopped  after  a  fixed  number 
of  iterations,  since  in  emission  tomography  the  sequence  of 
approximate  solutions  tends  to  first  approach  a  smooth  solution 
(image), then deviate from the smooth image, and eventually
approach $x^*$, a highly noisy solution. Three problems exist in
using  ART  and  CG  iterations:  1)  choosing  a  stopping  point 
may  be  very  subjective;  2)  the  stopping  point  depends  on  the 
object  to  be  reconstructed;  and  3)  there  is  no  other  control 
over  the  reconstruction  process. 

It  has  been  shown  in  [13]  and  [18]  that  the  generalized 
Landweber  iteration  with  the  dc-suppression  procedure  can  be 
as  fast  as  the  ART  or  CG  iteration.  Moreover,  the  iteration  is 
governed  by  the  shaping  matrix  and  the  number  of  iterations. 
Once  the  shaping  matrix  and  the  number  of  iterations  are 
determined,  the  outcome  of  the  iteration  can  be  viewed  as 
a  known  filtering  operation  on  the  reconstructed  image.  The 
iteration  is  system  dependent,  not  object  dependent,  and  it 
allows  control  over  the  frequency  content  of  the  reconstructed 
image, which ART and CG cannot provide.

III.  The  Generalized  Landweber  Iteration 
A.  The  Basic  Algorithm 

The  generalized  Landweber  iteration  has  the  form 

$$x^{k+1} = x^k + \alpha D A^T (b - A x^k) \tag{4}$$

where $\alpha$ is a gain factor usually set to $\alpha = 1/\sigma_1^2$, $D$ is a
shaping matrix (a polynomial function of $A$), and $x^k$ is
the reconstructed image after the $k$th iteration. Here $\sigma_1$ is
the largest singular value. For convenience, we define the
multiplication of a vector by $A$ as a forward projection, and the
multiplication of a vector by $A^T$ as a backward projection. In
analyzing the computational efficiency of an iterative method,
the number of forward and backward projections usually
serves as a better indicator than the number of iterations does
[13]. If the order of the polynomial function used to represent
$D$ is $l$, then each single generalized Landweber iteration will
need $(l + 1)$ forward and backward projections [18].

When $D = I$, the generalized Landweber iteration becomes
the Landweber iteration [19]. When initialized with zero, the
iteration converges to $x^*$ [18], [20], provided the Euclidean
norm $\|\alpha D A^T A\|_2 < 2$ [11], [21]. It has been shown that the
convergence of the generalized Landweber iteration can be
accelerated by using a dc-suppression procedure [13], which
makes it possible to use a bigger gain $\alpha = 1/\sigma_2^2$ after a single
Landweber iteration.
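
A minimal sketch of iteration (4) with a polynomial shaping matrix $D = F(\alpha A^T A)$ follows. The polynomial is applied by Horner's rule, so an order-$l$ polynomial costs $l$ forward/backward projection pairs, plus one pair for the residual, matching the $(l + 1)$ count quoted above. The function names, the dense test matrix, and the use of `np.linalg.norm(A, 2)` to obtain $\sigma_1$ are assumptions for illustration; the first-order coefficients `[4, -10/3]` are taken from the list of polynomials given later in Section III-C.

```python
import numpy as np

def apply_shaping(A, g, coeffs, alpha):
    """Return F(alpha A^T A) g, with F given by ascending coefficients [c0, ..., cl]."""
    y = coeffs[-1] * g
    for c in reversed(coeffs[:-1]):
        # One forward projection (A @ y) and one backward projection (A.T @ ...).
        y = alpha * (A.T @ (A @ y)) + c * g
    return y

def generalized_landweber(A, b, coeffs, n_iter, alpha=None):
    """Iterate x <- x + alpha * D * A^T (b - A x) from a zero start, eq. (4)."""
    if alpha is None:
        alpha = 1.0 / np.linalg.norm(A, 2) ** 2   # alpha = 1 / sigma_1^2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = A.T @ (b - A @ x)                     # residual: one more projection pair
        x = x + alpha * apply_shaping(A, g, coeffs, alpha)
    return x

# Hypothetical overdetermined, inconsistent toy problem.
A_gl = np.array([[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b_gl = np.array([1.0, 2.0, 0.0])
x_ls = np.linalg.lstsq(A_gl, b_gl, rcond=None)[0]

x_plain = generalized_landweber(A_gl, b_gl, [1.0], n_iter=200)              # D = I
x_shaped = generalized_landweber(A_gl, b_gl, [4.0, -10.0 / 3.0], n_iter=50)  # first-order F
```

Both runs approach the least-squares solution; the shaped run needs fewer iterations but two projection pairs per iteration instead of one.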

One example of $D$ is $D = F(\alpha A^T A)$, where the polynomial
function $F(\cdot)$ is chosen to be [11]

$$F(\lambda) = 31.5 - 315\lambda + 1443.75\lambda^2 - 3465\lambda^3 + 4504.5\lambda^4 - 3003\lambda^5 + 804.375\lambda^6. \tag{5}$$

This choice of polynomial function $F(\lambda)$ is made because
$\lambda F(\lambda)$ is a good approximation to the unit step function
in the range $\lambda \in (0, 1]$. This covers the entire spectrum of
frequency components from the highest frequency component
($\sigma \approx 0$) to the lowest frequency component ($\sigma = 1$) after
setting $\alpha = 1/\sigma_1^2$.

Fig. 1 shows the filtering effect of the generalized Landweber
iteration if $D = F(\alpha A^T A)$ as in (5). It is clear that after
about 3 iterations, the filter does not change much from one
iteration to another, and that the filtering gradually changes
from low-pass to all-pass as the number of iterations increases.

Fig. 1. The first 30 filters from the first 30 generalized Landweber iterations
using the shaping matrix as specified in (5) and with $\alpha = 1/\sigma_1^2$.
(Horizontal axis: singular value.)
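
The filter curves described for Fig. 1 can be tabulated directly from (5). The sketch below evaluates $P(\lambda) = \lambda F(\lambda)$ and the gain $1 - (1 - P(\lambda))^k$ that a zero-initialized iteration applies to a component after $k$ iterations. The names `gain_P` and `filter_after` and the grid are hypothetical, and $\lambda$ is simply the polynomial argument on $[0, 1]$.

```python
import numpy as np

# Coefficients of F in eq. (5), in ascending powers of lambda.
F6 = [31.5, -315.0, 1443.75, -3465.0, 4504.5, -3003.0, 804.375]

def gain_P(lam, coeffs=F6):
    """P(lambda) = lambda * F(lambda)."""
    lam = np.asarray(lam, dtype=float)
    return lam * np.polynomial.polynomial.polyval(lam, coeffs)

def filter_after(lam, k, coeffs=F6):
    """Gain applied to a component after k iterations from a zero start."""
    return 1.0 - (1.0 - gain_P(lam, coeffs)) ** k

lam_grid = np.linspace(0.0, 1.0, 11)
for k in (1, 3, 30):
    print(k, np.round(filter_after(lam_grid, k), 3))
```

Printing the rows for increasing $k$ shows the gain near $\lambda = 0$ rising slowly toward one while the rest of the band is already close to one, which is the low-pass to all-pass progression described in the text.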

In the generalized Landweber iteration, $\sigma_1$ is usually assumed
to be 1 [11], although in general $\sigma_1 \ne 1$ for any given
system matrix $A$. To effectively make $\sigma_1 = 1$, one can replace
$A$ with $A$ divided by $\sigma_1$, the maximum singular value of $A$;
projection data $b$ must also be replaced by $b$ divided by $\sigma_1$
[see (1)]. Dividing all the singular values of $A$ by $\sigma_1$ forces all
singular values of $A$ to lie between 0 and 1.

To implement the generalized Landweber iteration, one can
simply set $\alpha = 1/\sigma_1^2$ in (4) to effectively make $\sigma_1 = 1$ and
ensure the convergence of iteration. It has been shown [13]
that $\sigma_1$ can be obtained using the power method [22] with a
few forward and backward projections. We now discuss some
properties of the generalized Landweber iteration with $\sigma_1 = 1$
(which leads to $\alpha = 1$), as in [11] and [18].
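
A minimal sketch of the power method for $\sigma_1$, using only forward and backward projections, is given below. The starting vector, the iteration count, and the function name are assumptions for illustration; the procedure actually used in [13] may differ in these details.

```python
import numpy as np

def largest_singular_value(A, n_iter=50):
    """Estimate sigma_1 by power iteration on A^T A."""
    v = np.ones(A.shape[1]) / np.sqrt(A.shape[1])   # assumed starting vector
    for _ in range(n_iter):
        w = A.T @ (A @ v)            # one forward and one backward projection
        v = w / np.linalg.norm(w)
    return np.linalg.norm(A @ v)     # ||A v|| approaches sigma_1

# Hypothetical toy matrix.
A_pm = np.array([[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
sigma1_est = largest_singular_value(A_pm)
alpha_est = 1.0 / sigma1_est ** 2    # gain factor for eq. (4)
```

The starting vector must have a nonzero component along the leading right singular vector; a constant vector is a common choice when the image is nonnegative.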


B.  Convergence  Control 

In the generalized Landweber iteration, $D$ is an operator
mapping $R(A^T)$ to $\mathbb{R}^n$, a real vector space of dimension $n$.
Since $R(A^T)$ is spanned by $\{v_1, \ldots, v_{\rho(A)}\}$, the definition of
$D$ will be complete if the image $D v_i$ of each singular vector $v_i$
in $R(A^T)$ is specified [11]. Strand [11] proposed that matrix
$D$ be designed by specifying the scalars $p_1, \ldots, p_{\rho(A)}$ in

$$D v_i = p_i v_i, \qquad 0 < p_i \sigma_i^2 < 2 \tag{6}$$

where the condition $0 < p_i \sigma_i^2 < 2$ ensures the convergence,
and will be referred to as the convergence criterion of the
generalized Landweber iteration.

Suppose that the initial condition is $x^0$ and that $\sigma_1 = 1$ and
$\alpha = (q/\sigma_1)^2 = q^2$. The reconstructed image $x^k$ after the $k$th
generalized Landweber iteration can be represented as [18]

$$x^k = \sum_{i=1}^{\rho(A)} \underbrace{[1 - (1 - p_i q^2 \sigma_i^2)^k]}_{\text{data gain}} (1/\sigma_i)(b, u_i)\, v_i + \sum_{i=1}^{\rho(A)} \underbrace{(1 - p_i q^2 \sigma_i^2)^k}_{\text{initial condition gain}} (x^0, v_i)\, v_i. \tag{7}$$


Note that the data gain and the initial condition gain, after $k$
iterations, are governed by the factor $(1 - p_i q^2 \sigma_i^2)^k$, which
illustrates the control over convergence in the generalized
Landweber iteration. By varying $p_i$ and $q$, it is possible to
control the convergence rate of each component of the image
independently. And, by varying $k$, the number of iterations, the
extent of convergence can be controlled: some components
of the image can be partially filtered by stopping the iteration
early.
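
For readers who want to see where the two gains in (7) come from, the following short derivation projects iteration (4) onto a single singular vector. The shorthand $c_i^k = (x^k, v_i)$ and $\beta_i = (b, u_i)$ is introduced here for convenience and is not notation from the paper; the derivation uses $A v_i = \sigma_i u_i$, $A^T u_i = \sigma_i v_i$, and $D v_i = p_i v_i$ from (6).

```latex
% Shorthand (not from the paper): c_i^k = (x^k, v_i), \beta_i = (b, u_i), \alpha = q^2.
\begin{align*}
c_i^{k+1} &= c_i^k + \alpha p_i \sigma_i \bigl(\beta_i - \sigma_i c_i^k\bigr)
           = (1 - \alpha p_i \sigma_i^2)\, c_i^k + \alpha p_i \sigma_i \beta_i , \\
c_i^{k}   &= \bigl[1 - (1 - \alpha p_i \sigma_i^2)^k\bigr] \frac{\beta_i}{\sigma_i}
           + (1 - \alpha p_i \sigma_i^2)^k \, c_i^0 .
\end{align*}
```

The second line follows by summing the geometric series generated by the first; the bracketed factor is the data gain and the remaining factor is the initial condition gain.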

Recall that high-frequency image components are defined
as the components on singular vectors associated with small
singular values; low-frequency image components are associated
with large singular values. It is important to note that if $\alpha$
in (4) is less than $1/\sigma_1^2$, say, $\alpha = 1/\tilde{\sigma}^2$ where $\tilde{\sigma}$ is greater
than $\sigma_1$, i.e., $q < 1.0$ in (7), then the data gain in (7) becomes
smaller than it is with $q = 1$. This implies that the data gain
for each singular value becomes smaller and the convergence
of the generalized Landweber iteration is slower.

On the other hand, if we use $\alpha = 1/\tilde{\sigma}^2$, where $\tilde{\sigma}$ is
smaller than $\sigma_1$, i.e., $q > 1.0$ in (4), then the data gain
in (7) becomes larger than it is with $q = 1.0$. It has been
shown in [13] that doing this can endanger the stability of
the generalized Landweber iteration. This motivated the dc-suppression
procedure of [13] to set $\alpha = 1/\sigma_2^2$ after a single
Landweber iteration. Since most of the singular values in this
approach are equivalently shifted toward 1, the speed of the
generalized Landweber iteration will be accelerated.


Fig. 2. The number of iterations required to recover 95% of the object
components projected onto singular vectors associated with various singular
values in the generalized Landweber iteration using $F_l(\lambda)$, $l = 1, \ldots, 6$.
Discontinuous structure arises because there are $l + 1$ forward and backward
projections per complete generalized Landweber iteration. (Horizontal axis: singular value.)

C. Design of a Shaping Matrix

In order to have the reconstructed image converge faster
to the minimum-norm least-squares solution $x^*$, we need to
design a polynomial function $F(\lambda)$ such that $P(\lambda) = \lambda F(\lambda)$
is as close to one as possible for $\lambda \in (0, 1]$. Note that $P(\lambda)$
is a continuous function and $P(\lambda = 0) = 0$ no matter what
$F(\lambda)$ is used. Thus, the choice of $F(\lambda)$ in designing a shaping
matrix will have a very limited effect on the reconstruction of
the high-frequency components, which have singular values
very close to 0.

Suppose $P(\lambda)$ is $l$th order, i.e.,

$$P(\lambda) = \lambda F(\lambda) = c_1 \lambda + c_2 \lambda^2 + \cdots + c_l \lambda^l \tag{8}$$

where $c_1, c_2, \ldots, c_l$ are scalars. The least squares criterion [11]

$$\text{minimize} \int_0^a P^2(\lambda)\, d\lambda + \int_a^1 [1 - P(\lambda)]^2\, d\lambda \tag{9}$$

is used to design $F(\lambda)$, where $a$ is a parameter indicating a
cutoff value between 0 and 1 for designing $F(\lambda)$. Intuitively,
(9) specifies a shaping matrix $D = F(\alpha A^T A)$ such that
frequency components with singular values between 0 and
$a$ will be recovered slowly, while components with singular
values between $a$ and 1 will be recovered quickly. If $a > 0$,
the resulting shaping matrix will attenuate the frequency
components with singular values between 0 and $a$. Since $P(\lambda)$
is continuous and $P(0) = 0$, this approach shows little success
[11].

Minimizing (9) with given order $l$ and cutoff value $a$ leads
to the following set of linear equations:

$$\sum_{j=1}^{l} \frac{c_j}{i + j + 1} = \frac{1 - a^{i+1}}{i + 1}, \qquad i = 1, \ldots, l. \tag{10}$$

Several examples of polynomial functions $F_l(\lambda)$ from (10)
with $a = 0$ and orders $l = 1, \ldots, 6$ are listed as follows:

$F_1(\lambda) = 4 - \tfrac{10}{3}\lambda$

$F_2(\lambda) = 7.5 - 15\lambda + 8.75\lambda^2$

$F_3(\lambda) = 12 - 42\lambda + 56\lambda^2 - 25.2\lambda^3$

$F_4(\lambda) = 17.5 - \tfrac{280}{3}\lambda + 210\lambda^2 - 210\lambda^3 + 77\lambda^4$

$F_5(\lambda) = 24 - 180\lambda + 600\lambda^2 - 990\lambda^3 + 792\lambda^4 - 245.14286\lambda^5$

$F_6(\lambda) = 31.5 - 315\lambda + 1443.75\lambda^2 - 3465\lambda^3 + 4504.5\lambda^4 - 3003\lambda^5 + 804.375\lambda^6.$

Note that $F_6(\lambda)$ is the same as $F(\lambda)$ in (5).
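
The linear system (10) is small and can be solved directly. The sketch below builds it for a chosen number of coefficients and cutoff $a$. One observation, checked here against $F_1$ and $F_2$ rather than stated in the source: the listed $F_l$ of order $l$ are reproduced when (10) is solved with $l + 1$ unknowns, since $P(\lambda) = \lambda F(\lambda)$ then has $l + 1$ terms. The function name and argument names are hypothetical.

```python
import numpy as np

def design_shaping_coeffs(n_coef, a=0.0):
    """Solve eq. (10) for c_1..c_n.

    Because F(lambda) = P(lambda) / lambda, the returned vector is also the
    coefficient list of F in ascending powers of lambda.
    """
    idx = np.arange(1, n_coef + 1, dtype=float)
    H = 1.0 / (idx[:, None] + idx[None, :] + 1.0)     # entries 1 / (i + j + 1)
    rhs = (1.0 - a ** (idx + 1.0)) / (idx + 1.0)      # (1 - a^(i+1)) / (i + 1)
    return np.linalg.solve(H, rhs)

f1 = design_shaping_coeffs(2)   # first-order F (two coefficients)
f2 = design_shaping_coeffs(3)   # second-order F (three coefficients)
print(f1, f2)
```

The matrix in (10) is a Hilbert-type matrix and becomes badly conditioned as the order grows, so for higher orders the computed coefficients should be checked against the tabulated values rather than trusted blindly.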

Fig. 2 shows the number of iterations required to recover
95 percent of the object components projected onto singular
vectors associated with various singular values in the generalized
Landweber iteration for each of the above 6 polynomial
functions. It is clear that using a higher order polynomial
function results in a faster reconstruction of high-frequency
components ($\sigma < 0.2$), at the expense of slower reconstruction
of low-frequency components ($\sigma > 0.2$). After about 20
forward and backward projections, a higher order polynomial
function outperforms a lower order polynomial function in
the sense that the low-frequency components with $\sigma > 0.2$
have been almost completely recovered, and the high-order
polynomial function has recovered more of the high-frequency
components with $\sigma < 0.2$.
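
The quantity plotted in Fig. 2 can be approximated from the per-component error factor $|1 - \lambda F(\lambda)|^k$: the smallest $k$ that brings it down to 0.05, multiplied by the $l + 1$ projection pairs per iteration. The sketch below assumes a zero initial condition and $q = 1$; the function names are hypothetical, and $\lambda$ is the polynomial argument, so relating it to the figure's axis is left to the reader.

```python
import math

def horner(coeffs, lam):
    """Evaluate a polynomial with ascending coefficients at lam."""
    val = 0.0
    for c in reversed(coeffs):
        val = val * lam + c
    return val

def projections_to_recover(coeffs, lam, eps=0.05):
    """Projection pairs needed until |1 - lam F(lam)|^k <= eps (inf if never)."""
    r = abs(1.0 - lam * horner(coeffs, lam))
    if r >= 1.0:
        return math.inf                  # component is not recovered
    if r == 0.0:
        return len(coeffs)               # recovered in a single iteration
    k = max(1, math.ceil(math.log(eps) / math.log(r)))
    return k * len(coeffs)               # (l + 1) projection pairs per iteration

landweber = [1.0]                        # D = I
first_order = [4.0, -10.0 / 3.0]         # F_1 from the list above
cost_plain = projections_to_recover(landweber, 0.5)
cost_first = projections_to_recover(first_order, 1.0)
```

Because the cost moves in steps of $l + 1$, curves computed this way show the same discontinuous structure mentioned in the caption of Fig. 2.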

Moving the value $a$ away from 0 may also result in
designing a shaping filter with negative gains in the high-frequency
range associated with small singular values. Combining
the positive and negative gains in a range of frequency
components (usually in the high-frequency range) can result in
cancellation of the frequency components in this range. Since
the signal-to-noise ratio is normally small in the high-frequency
range, the high-frequency components (both the reconstructed
image and the reconstructed noise) may cause the image to
look noisy. Removing the high-frequency components can
result in a smoother reconstructed image.

D. Recover-and-Stay Property

We now state the recover-and-stay property: image components,
once recovered within some $\epsilon$, will still stay within $\epsilon$
in the subsequent iterations even if $D$ is changed to $D'$, as
long as $D'$ satisfies the convergence criterion for those almost
recovered components. A formal proof of this property is as
follows.

Given zero initial condition ($x^0 = 0$), from (7) the reconstructed
image $x^{k_1}$ after $k_1$ generalized Landweber iterations
is

$$x^{k_1} = \sum_{i=1}^{\rho(A)} [1 - (1 - p_i \sigma_i^2)^{k_1}] (1/\sigma_i)(b, u_i)\, v_i. \tag{11}$$

Suppose the first $l$ components have been reconstructed to
within a factor $\epsilon$ in $x^{k_1}$, i.e.,

$$|(1 - p_i \sigma_i^2)^{k_1}| < \epsilon, \qquad i = 1, \ldots, l. \tag{12}$$

Then even with a different shaping matrix $D'$ such that $D' v_i = p_i' v_i$,
the reconstructed image is (from (7) with $x^0 = x^{k_1}$)

$$\begin{aligned}
x^k &= \sum_{i=1}^{\rho(A)} [1 - (1 - p_i' \sigma_i^2)^k] (1/\sigma_i)(b, u_i)\, v_i + \sum_{i=1}^{\rho(A)} (1 - p_i' \sigma_i^2)^k (x^{k_1}, v_i)\, v_i \\
&= \sum_{i=1}^{\rho(A)} [1 - (1 - p_i' \sigma_i^2)^k] (1/\sigma_i)(b, u_i)\, v_i + \sum_{i=1}^{l} (1 - p_i' \sigma_i^2)^k (x^{k_1}, v_i)\, v_i + \sum_{i=l+1}^{\rho(A)} (1 - p_i' \sigma_i^2)^k (x^{k_1}, v_i)\, v_i \\
&= \sum_{i=1}^{l} [1 - (1 - p_i' \sigma_i^2)^k (1 - p_i \sigma_i^2)^{k_1}] (1/\sigma_i)(b, u_i)\, v_i + \sum_{i=l+1}^{\rho(A)} [1 - (1 - p_i' \sigma_i^2)^k] (1/\sigma_i)(b, u_i)\, v_i + \sum_{i=l+1}^{\rho(A)} (1 - p_i' \sigma_i^2)^k (x^{k_1}, v_i)\, v_i.
\end{aligned} \tag{13}$$

If $D'$ satisfies the convergence criterion in (6) for the first $l$
components, then

$$|(1 - p_i' \sigma_i^2)^k| < 1, \qquad i = 1, \ldots, l. \tag{14}$$

The coefficients in the first summation of the last equation of
(13) will then satisfy

$$1 - \epsilon < 1 - (1 - p_i' \sigma_i^2)^k (1 - p_i \sigma_i^2)^{k_1} < 1 + \epsilon, \qquad i = 1, \ldots, l. \tag{15}$$

As long as $D'$, the second shaping matrix, satisfies the convergence
criterion in (6) for the almost recovered components,
the maximum deviation for these components will still be
within $\epsilon$ in the subsequent iterations with $D'$. We will show by
examples that it is not difficult to have $\epsilon < 0.05$. This proves
the recover-and-stay property of the generalized Landweber
iteration.
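
The step in (13) that merges the data term and the carried-over term for an already recovered component can be written out for a single index. The abbreviations $r_i = 1 - p_i \sigma_i^2$ and $r_i' = 1 - p_i' \sigma_i^2$ are introduced here only to shorten the algebra; they do not appear in the paper.

```latex
% Abbreviations (not from the paper): r_i = 1 - p_i \sigma_i^2, r_i' = 1 - p_i' \sigma_i^2.
\begin{align*}
(x^{k}, v_i)
  &= \bigl[1 - r_i'^{\,k}\bigr] \frac{(b, u_i)}{\sigma_i}
   + r_i'^{\,k} \bigl[1 - r_i^{\,k_1}\bigr] \frac{(b, u_i)}{\sigma_i}
   = \bigl[1 - r_i'^{\,k}\, r_i^{\,k_1}\bigr] \frac{(b, u_i)}{\sigma_i}, \\
\bigl| r_i'^{\,k}\, r_i^{\,k_1} \bigr|
  &= |r_i'|^{k}\, \bigl| r_i^{\,k_1} \bigr| < \epsilon
  \qquad \text{whenever } |r_i'| < 1 \text{ and } \bigl| r_i^{\,k_1} \bigr| < \epsilon .
\end{align*}
```

The first line uses (11) for $(x^{k_1}, v_i)$; the second combines (12) and (14) and gives the two-sided bound (15).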

IV.  The  Variable  Shaping  Matrix  Method 

From the recover-and-stay property, different shaping matrices
may be used to emphasize or deemphasize some frequency
components.  We  have  found  that  one  shaping  matrix  can  be 
used  for  the  first  several  iterations  to  recover  low-frequency 
components  of  an  image;  and  then  another  shaping  matrix, 
which  requires  less  computation  than  the  first  shaping  matrix 
in  a  single  generalized  Landweber  iteration,  can  be  used  to 
accelerate  the  reconstruction  of  high-frequency  components. 
We  also  found  that  if  the  second  shaping  matrix  provides 
negative  gains  in  a  range  of  high-frequency  components,  the 
partial  recovery  of  the  high-frequency  components  in  this 
range  using  the  first  shaping  matrix  can  be  removed.  In  order 
to  remove  some  components,  from  the  proof  of  the  recover- 
and-stay  property,  the  second  shaping  matrix  has  to  violate  the 
convergence  criterion  for  the  components  to  be  removed.  The 
result  of  removing  high-frequency  components  is  very  similar 
to  the  result  using  TIF.  Two  different  examples  are  provided  to 
demonstrate  1)  acceleration  of  convergence,  and  2)  attenuation 
of  high-frequency  components. 

A.  Acceleration  of  Reconstruction  of  High-Frequency 
Components 

Suppose the shaping matrix $D = F(\alpha A^T A)$, where $F(\cdot)$
is defined in (5), is used in the first three iterations of the
generalized Landweber iteration with $\alpha = 1/\sigma_1^2$. From Fig. 1,
we know that the frequency components with singular values
above 0.2 are recovered up to 95 percent ($\epsilon < 0.05$) in
these three iterations (which are equivalent to 21 forward and
backward projections).

Starting with the fourth iteration, suppose the iteration
switches the shaping matrix $D = F(\alpha A^T A)$ to $D' = F'(\alpha A^T A)$,
where $F'(\cdot)$ is defined as

$$F'(\lambda) = 25.94897 - 126.47433\lambda + 200.73822\lambda^2 - 100.03043\lambda^3. \tag{16}$$

$F'(\lambda)$ is obtained from the Chebyshev approximation [23] of
a third-order polynomial to $F(\lambda)$ in (5). The idea is to use
a lower order polynomial to approximate $F(\lambda)$ to maintain
the response in the range of high-frequency components,
while relaxing the response in the range of low-frequency
components, which have almost been recovered.

Fig. 3 shows the convergence properties of (16) for the first
30 generalized Landweber iterations. Although there are large
variations from one iteration to another in the range of
singular values [0.2, 1.0], the performance in the range from 0
to 0.2 is very similar to that of using (5) in Fig. 1. Moreover,
$D'$ still satisfies the convergence criterion for $\sigma \in [0.2, 1.0]$
(see Fig. 3).

Fig. 4 shows that $F'(\lambda)$ actually outperforms $F(\lambda)$ in
recovering 95 percent of the high-frequency components with
singular values $\sigma \in (0, 0.2]$. Although $F'(\lambda)$ behaves poorly
when $\sigma \in [0.2, 1.0]$, from the recover-and-stay property, $F'(\lambda)$
can be applied after the low-frequency components with $\sigma \in [0.2, 1.0]$
have been almost recovered from using $F(\lambda)$.

Fig. 3. Convergence properties of the third-order polynomial function $F'(\lambda)$
for the first 30 generalized Landweber iterations. The first, second, third, and
final iterations are labeled.

Fig. 4. Comparison of performance in recovering 95% of components
on various singular values using the generalized Landweber iteration
with $F(\lambda)$ [marked as G-Landweber(6th)], and with $F'(\lambda)$ [marked as
G-Landweber(app)]. It is clear that $F'(\lambda)$ outperforms $F(\lambda)$ in $\sigma \in (0, 0.2]$
since it requires less computation than $F(\lambda)$. (Horizontal axis: singular value.)

We used this method to simulate the noise-free reconstruction
of the complex phantom in a simulated positron emission
tomography (PET) geometry. We used the PET-3 geometry in
[13] and [18], whose system matrix is 4160 x 3228, with 4160
projection elements and 3228 image pixels. Fig. 5 compares
the performance of using $F(\lambda)$ alone, and the performance of
using $F(\lambda)$ for three iterations followed by using $F'(\lambda)$. This
shows how the variable shaping matrix method speeds up the
reconstruction of high-frequency components.

B.  Attenuation  of  High-Frequency  Components 
in  the  Reconstruction 

We now demonstrate another way of using a variable shaping matrix: to
attenuate high-frequency components in the reconstructed image to
obtain a smooth image. This can also be viewed as a solution to a
regularized image reconstruction problem. Using a shaping matrix
$D = F(\alpha A^T A)$ as in (5) will partially recover some
high-frequency components, which may deteriorate the image quality
significantly if these components are noisy. Thus, in some
applications, it may be desirable to remove high-frequency components,
including both the reconstructed image and the reconstructed noise, in
the range (say) $\sigma \in (0, 0.1]$, to make the image smooth. We
now show how the generalized Landweber iteration can do this during
the image reconstruction process.

Fig. 5. Comparison between using $F(\lambda)$ alone (marked as
G-Landweber) and using $F(\lambda)$ for the first three iterations
followed by using $F'(\lambda)$ [marked as G-Landweber(app)]. The plus
and diamond symbols represent the result of a single iteration. Note
that $F'(\lambda)$ requires less computation than $F(\lambda)$ does to
achieve the same Euclidean distance between the original and
reconstructed images.

Suppose the polynomial function $F(\lambda)$ in (5) is used for the
first three generalized Landweber iterations to ensure that
low-frequency components with $\sigma \in (0.2, 1]$ have been almost
recovered. Fig. 1 shows the response using $F(\lambda)$ in the first
three generalized Landweber iterations. Unfortunately, it is clear
that high-frequency components with $\sigma \in [0, 0.2]$ have also
been partially recovered. We would like to avoid this.

Now suppose the polynomial function

$$F''(\lambda) = -10.972414 + 209.51804\lambda - 1072.02\lambda^2 + 2599.2093\lambda^3 - 3313.8866\lambda^4 + 2146.6546\lambda^5 - 557.55271\lambda^6 \tag{17}$$

is used for the subsequent iterations. $F''(\lambda)$ is found by
designing a polynomial function [see (10)] with cutoff value
$\sigma = 0.15$ and order $l = 6$. It provides negative gains in the
range $\sigma \in (0, 0.28]$, which covers $\sigma \in (0, 0.2]$.
$F''(\lambda)$ is used for another nine generalized Landweber
iterations to ensure that sufficient attenuation or removal of
high-frequency components has been achieved. In this case, the shaping
matrix $D'' = F''(\alpha A^T A)$ does not satisfy the convergence
criterion for the components in $\sigma \in (0, 0.28]$. Fig. 6(a)
shows the response using $F''(\lambda)$ alone for nine generalized
Landweber iterations. Note that this iteration cannot be allowed to
continue for many iterations, since it will diverge.
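
The divergence warning can be illustrated in a scalar model. This is a sketch under assumptions, not the authors' computation: $\lambda = \sigma^2$ with normalized singular values, a zero initial image, and $F''(\lambda)$ taken with alternating coefficient signs, which is the choice that gives negative gain only below roughly $\sigma = 0.28$. The function name `response` is illustrative.

```python
# Coefficients of F''(lambda), lowest order first, with alternating signs.
F_DOUBLE_PRIME = [-10.972414, 209.51804, -1072.02, 2599.2093,
                  -3313.8866, 2146.6546, -557.55271]


def poly(coeffs, lam):
    """Evaluate c0 + c1*lam + ... by Horner's rule."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * lam + c
    return acc


def response(sigma, n_iter):
    """Gain applied to a component after n_iter iterations from zero."""
    lam = sigma ** 2
    factor = 1.0 - lam * poly(F_DOUBLE_PRIME, lam)  # > 1 when F'' < 0
    return 1.0 - factor ** n_iter


# A component at sigma = 0.1 sits in the negative-gain range.
for n_iter in (9, 50, 100):
    print(n_iter, response(0.1, n_iter))
```

Because the per-iteration factor exceeds 1 wherever $F''(\lambda) < 0$, the magnitude of the response grows geometrically with the iteration count, which is why the number of iterations with $F''(\lambda)$ has to be fixed in advance.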

Now, if $F(\lambda)$ is used for the first three generalized Landweber
iterations, and $F''(\lambda)$ for the subsequent nine iterations, the
resulting response after these twelve iterations is similar to the
result of low-pass filtering with a transition band having low cutoff
frequency at $\sigma_{lo} = 0.19$ and high cutoff frequency at
$\sigma_{hi} = 0.08$. Fig. 6(b) shows the transition (iteration by
iteration), combining the use of $F(\lambda)$ for three iterations
(solid lines) and the use of $F''(\lambda)$ for the subsequent nine
iterations (dotted lines). The cutoff frequencies $\sigma_{hi}$ and
$\sigma_{lo}$ were chosen at the frequencies with gains 0.05 and 0.95,
respectively. In general, this type of shaping is also referred to as
rolloff shaping [8]. For the components in $\sigma \in [0.2, 0.28]$,
where the almost recovered components and to-be-removed components
overlap, the second shaping has little effect for small $\epsilon$
(less than about 0.01).

Fig. 6. (a) The response using $F''(\lambda)$ for the first nine
generalized Landweber iterations. (b) The transition from combining
$F(\lambda)$ for three iterations (solid lines) and $F''(\lambda)$ for
the subsequent nine iterations (dotted lines). The resulting lowpass
filter (rightmost dotted line) after the twelve iterations has the low
cutoff and high cutoff frequencies at $\sigma_{lo} = 0.19$ and
$\sigma_{hi} = 0.08$, respectively. The cutoff frequencies
$\sigma_{hi}$ and $\sigma_{lo}$ were chosen at the frequencies with
gains 0.05 and 0.95, respectively.
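
The twelve-iteration filter can be approximated by multiplying per-iteration error factors. This is a minimal sketch, assuming normalized singular values ($\lambda = \sigma^2$), a zero initial image, no dc-suppression, and $F''(\lambda)$ with alternating coefficient signs. The names `cascade_response` and `crossing` are hypothetical. Because of these simplifications the computed cutoffs should land near the quoted values but need not match them exactly.

```python
F = [31.5, -315.0, 1443.75, -3465.0, 4504.5, -3003.0, 804.375]
F_DOUBLE_PRIME = [-10.972414, 209.51804, -1072.02, 2599.2093,
                  -3313.8866, 2146.6546, -557.55271]


def poly(coeffs, lam):
    """Evaluate c0 + c1*lam + ... by Horner's rule."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * lam + c
    return acc


def cascade_response(sigma, n_first=3, n_second=9):
    """Overall gain after n_first iterations with F, then n_second with F''."""
    lam = sigma ** 2
    err = ((1.0 - lam * poly(F, lam)) ** n_first
           * (1.0 - lam * poly(F_DOUBLE_PRIME, lam)) ** n_second)
    return 1.0 - err


def crossing(gain, lo=0.01, hi=0.5, steps=4900):
    """Smallest grid value of sigma at which the response reaches `gain`."""
    for i in range(steps + 1):
        sigma = lo + (hi - lo) * i / steps
        if cascade_response(sigma) >= gain:
            return sigma
    return None


print("sigma at gain 0.05:", crossing(0.05))
print("sigma at gain 0.95:", crossing(0.95))
```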

We  used  this  method  to  simulate  the  reconstruction  of  the 
complex  phantom  in  the  simulated  PET-3  geometry  from  noisy 
data.  The  number  of  total  counts  was  1  million;  the  data  noise 
was  Poisson  in  nature.  Fig.  7  shows  the  result  of  the  twelve 
iterations.  Compare  the  third  and  the  twelfth  reconstructed 
images  of  Fig.  7.  As  expected,  the  twelfth  reconstructed  image 
is  smoother  than  the  third  image. 

An important feature of the generalized Landweber iteration can also
be seen in Fig. 7: not only have we obtained the reconstructed images,
but we also have a known relation between a reconstructed image and
the shaping filter generating the image. This known relation is unique
to the generalized Landweber iteration; it is not present in the ART
or CG iterations. Since $F(\lambda)$ and $F''(\lambda)$ are both
polynomial functions of order 6, 84 (= (6 + 1) × 12) forward and
backward projections are known to be needed to obtain the desired
image.

In examining the sequence of images in Fig. 7, note the change due to
the amplification and attenuation of the components in
$\sigma \in [0.2, 1.0]$. The deterioration of the third image was due
to the partial reconstruction of high-frequency components. The change
in image quality from the first image to the second image, and from
the second image to the third image, shows that partial reconstruction
of the high-frequency components can deteriorate image quality. This
shows the necessity of filtering out the high-frequency components to
smooth the image.

Fig. 7. The reconstructed images from the first three generalized
Landweber iterations using $F(\lambda)$ and the nine subsequent
iterations using $F''(\lambda)$.

C.  A  Family  of  Filters  Based  on  the  Same  Shaping 
Matrix  with  a  Different  Gain  Factor 

We show that it is possible to use the same shaping filter with a
different gain factor $\alpha$ in (4) to target different ranges of
high-frequency components. Thus, noise filtering can be performed
without designing another shaping filter.

For example, set $\alpha = 1/\sigma_1^2$ and $D = I$ during the first
iteration; then adopt the dc-suppression procedure and set
$\alpha = 1/\sigma_2^2$ and $D = F(\alpha A^T A)$ for the subsequent
iterations. The low cutoff and high cutoff transition band frequencies
will change from $\sigma_{lo}$ and $\sigma_{hi}$ to
$(\sigma_2/\sigma_1)\sigma_{lo}$ and $(\sigma_2/\sigma_1)\sigma_{hi}$,
respectively; that is, both cutoff frequencies are multiplied by the
ratio $\sigma_2/\sigma_1$. In our example, $\sigma_1 = 0.986703$ and
$\sigma_2 = 0.587256$. If $\alpha = (g/\sigma_1)^2$ where $g < 1$,
then the low cutoff and high cutoff transition band frequencies become
$(1/g)\sigma_{lo}$ and $(1/g)\sigma_{hi}$, respectively.
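
The scaling rule can be written out as a short calculation. This sketch takes the cutoffs 0.19 and 0.08 quoted for the twelve-iteration filter as reference values, which is an assumption about the gain factor they correspond to. It applies the stated rule that a gain factor $\alpha = (g/\sigma_1)^2$ scales both cutoffs by $1/g$. The function name is illustrative.

```python
SIGMA_LO, SIGMA_HI = 0.19, 0.08          # reference cutoffs (assumed baseline)
SIGMA_1, SIGMA_2 = 0.986703, 0.587256    # singular values quoted in the text


def cutoffs_for_gain(g):
    """Cutoffs when alpha = (g / sigma_1)**2: both are scaled by 1/g."""
    return SIGMA_LO / g, SIGMA_HI / g


# alpha = 1/sigma_2**2 is the case g = sigma_1/sigma_2, so the cutoffs
# are multiplied by sigma_2/sigma_1.
print("alpha = 1/sigma_2^2:", cutoffs_for_gain(SIGMA_1 / SIGMA_2))

# The family of reduced gains g = 0.8, ..., 0.4 pushes both cutoffs up.
for g in (0.8, 0.7, 0.6, 0.5, 0.4):
    print(g, cutoffs_for_gain(g))
```

Lowering $g$ moves the transition band toward larger singular values, so more of the spectrum is treated as high frequency and attenuated.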

Fig. 8 shows the frequency responses after three generalized Landweber
iterations, using the polynomial function $F(\lambda)$ in (5), with
the gain factor $\alpha$ set equal to $1/\sigma_1^2$, $1/\sigma_2^2$,
$(0.8/\sigma_1)^2$, $(0.7/\sigma_1)^2$, $(0.6/\sigma_1)^2$,
$(0.5/\sigma_1)^2$, and $(0.4/\sigma_1)^2$, separately. Fig. 9 shows
the frequency responses and cutoff frequencies of the resulting filter
after 12 iterations in Fig. 6(b), with the gain factor $\alpha$ set
equal to $1/\sigma_1^2$, $1/\sigma_2^2$, $(0.8/\sigma_1)^2$,
$(0.7/\sigma_1)^2$, $(0.6/\sigma_1)^2$, $(0.5/\sigma_1)^2$, and
$(0.4/\sigma_1)^2$, separately. We call the shaping filters in Fig. 8
unregularized filters, and the shaping filters in Fig. 9 regularized
filters.

The reconstructed images using the unregularized filters in Fig. 8 and
the regularized filters in Fig. 9 are shown in Figs. 10 and 11,
respectively. The number at the upper-left corner of each image
corresponds to the filter with the same number. The number 0 indicates
the original image. As expected, the images in Fig. 11 are smoother
than the corresponding images in Fig. 10. An important benefit of a
family of shaping filters is that it allows us to choose an
appropriate filter for the image reconstruction.
Fig. 8. The frequency responses after the first three generalized
Landweber iterations using $F(\lambda)$ with $\alpha = 1/\sigma_1^2$,
$1/\sigma_2^2$, $(0.8/\sigma_1)^2$, $(0.7/\sigma_1)^2$,
$(0.6/\sigma_1)^2$, $(0.5/\sigma_1)^2$, and $(0.4/\sigma_1)^2$,
separately.

Fig. 9. The frequency responses using the resulting filter with
$\alpha = 1/\sigma_1^2$, $1/\sigma_2^2$, $(0.8/\sigma_1)^2$,
$(0.7/\sigma_1)^2$, $(0.6/\sigma_1)^2$, $(0.5/\sigma_1)^2$, and
$(0.4/\sigma_1)^2$, separately.

Fig. 10. The reconstructed images from the unregularized filters.

Fig. 11. The reconstructed images from the regularized filters.


D.  Comparison  of  Computational  Requirement  of 
Rolloff  Shaping  and  Truncated  Shaping 

We compare the computational requirements for a rolloff shaping filter
(see Section IV-B) and a truncated shaping filter as in TIF. The major
difference is that the rolloff shaping can be performed iteratively
(84 forward and backward projections for the shapings in Fig. 9),
while the truncated shaping must be computed directly using (2); this
requires an enormous amount of computation.

Direct computation of $x^*$ using (3) requires storage of the matrices
$U$ and $V$ and a considerable amount of computation to implement (3),
since $U$ and $V$ (unlike $A$) are not sparse. It also requires the
(off-line) computation of the SVD of $A$. Consider a
$4000 \times 4000$ system matrix $A$, and assume 4 bytes are used to
store a number in floating-point format. Then 128 million
(= 4000 × 4000 × 4 × 2) bytes will be required just to store $U$ and
$V$. On the other hand, if $A$ is 97 percent sparse (3 percent of the
elements of $A$ are nonzero), the computation time for obtaining $x^*$
in (3) is equivalent to about 33 (≈ 2/0.06) forward and backward
projections when the sparse structure of $A$ is utilized in the
iteration. If $k$ in (3) is chosen to preserve half of the singular
values, then 17 forward and backward projections are required for the
computation.

In some applications, the system matrix $A$ without modeling scatter
becomes sparser as its size increases. For example, a
150-million-element system matrix in [14] has about 2 million nonzero
elements, so that it is 98.7 percent sparse. In this case, an SVD
computation becomes almost impossible. The computation time for
obtaining $x^*$ is equivalent to about 75 (≈ 2/0.026) forward and
backward projections. If half of the singular values are preserved,
about 37 forward and backward projections will be needed. Hence, even
apart from the storage problem and the difficulty in computing the SVD
of $A$, the iterative approach of rolloff shaping becomes more
favorable than the direct approach using TIF as the size and
sparseness of $A$ grow.
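
The arithmetic behind these figures can be reproduced directly. This sketch rests on one reading of the ratio 2/0.06, which is an assumption: applying (3) costs two dense matrix-vector products (one with $U$, one with $V$), while one forward plus one backward projection on sparse $A$ touches twice the nonzero fraction of the matrix. The function names are illustrative.

```python
def dense_svd_storage_bytes(n, bytes_per_number=4):
    """Bytes needed to store the two dense n x n factors U and V."""
    return 2 * n * n * bytes_per_number


def equivalent_projections(nonzero_fraction, kept_fraction=1.0):
    """Cost of applying (3), in units of forward + backward projection pairs.

    Dense cost:  kept_fraction * 2 * n**2 multiplications (U and V).
    Sparse pair: 2 * nonzero_fraction * n**2 multiplications (A and A^T).
    """
    return kept_fraction * 2.0 / (2.0 * nonzero_fraction)


print(dense_svd_storage_bytes(4000))                 # U and V, 4-byte floats
print(equivalent_projections(0.03))                  # 97 percent sparse
print(equivalent_projections(0.03, 0.5))             # half the singular values
print(equivalent_projections(2e6 / 150e6))           # 2 million of 150 million
print(equivalent_projections(2e6 / 150e6, 0.5))
```

Under this reading, the break-even number of projections grows in inverse proportion to the nonzero fraction, which matches the claim that the iterative approach becomes more favorable as $A$ gets larger and sparser.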

V.  Summary 

A new variable shaping matrix method for using the generalized
Landweber iteration in image reconstruction in emission tomography has
been developed. The method uses two shaping matrices: the first is for
fast recovery of low-frequency components; the second is either for
the acceleration of the reconstruction of high-frequency components,
if noise does not pose a problem in recovering them, or for the
attenuation of high-frequency components when noise is a problem. The
recover-and-stay property of the generalized Landweber iteration has
been stated and proved for the first time in this paper; it serves as
the foundation of the variable shaping matrix method.

Examples of using a variable shaping matrix to accelerate
reconstruction and to attenuate high-frequency components were
presented. A sequence of filters from the same shaping matrices with
different gain factors was also shown. From this sequence, one may
choose an appropriate filter for the image reconstruction. The savings
in computation and storage space from using a variable shaping matrix
instead of TIF were also presented; they result from the sparseness of
$A$, which generally increases as the size of $A$ increases. Hence,
the larger the system, the greater the advantage of the new method
over TIF.

An important topic for further research is application of the
generalized Landweber iteration with a variable shaping matrix to
linear-algebra-based signal processing problems. The solution (2) is
used in signal restoration problems, harmonic retrieval, and
deconvolution. The method proposed in this paper can be applied to all
of those problems, even if the matrix is not sparse; the computation
required is less than that for an SVD. Another topic is further study
on designing the shaping matrix to achieve a given specification of
low and high cutoff frequencies with the least amount of computation.
Abstract

We use a variable shaping matrix in the generalized Landweber
iteration to solve the large linear system of equations arising in the
image reconstruction problem of emission tomography. Our method is
based on the property that once a spatial frequency image component is
recovered in the generalized Landweber iteration, this component will
not change during subsequent iterations, even if a different shaping
matrix is used. Two different shaping matrices are used: the first
recovers low-frequency image components; the second may be used either
to accelerate the reconstruction of high-frequency image components,
or to attenuate these components to filter the image. The variable
shaping matrix gives results similar to truncated inverse filtering,
but requires much less computation and memory, since it does not rely
on the singular value decomposition.

I.  Introduction 

In emission tomography, the image reconstruction problem can be
formulated as the solution to a linear system of equations. This
system tends to be large, with dimensions on the order of thousands,
and ill-conditioned. One approach to this problem is to perform a
singular value decomposition (SVD) of the system matrix and set some
small singular values to zero to obtain a stable solution (one that is
not sensitive to the measurement noise). This approach is called
truncated inverse filtering (TIF).

This paper defines “spatial frequencies” as the reciprocals of
singular values, and the associated image components as the projection
of the image on the corresponding singular vectors. High spatial
frequency components of the object are defined as components on
singular vectors associated with small singular values; and low
spatial frequency
components as components on singular vectors with large singular
values. Using this terminology, in TIF the high frequency components
of the reconstructed image, which are sensitive to noise, will be
removed.

Computing the SVD of a large system matrix takes a long time and much
memory. Furthermore, in emission tomography, attenuation results in
different objects having different system matrices, each of which
would require an SVD. Thus using TIF to derive a solution for the
image reconstruction in emission tomography requires a huge amount of
computation and memory.

This paper presents a novel alternative approach: using the
generalized Landweber iteration [2] with a variable shaping matrix
(defined below). This method can either 1) accelerate the
reconstruction of high frequency components; or 2) roll off the
inverse filter to prevent the Gibbs phenomenon arising from the sharp
truncation in TIF, and preserve the stability of the solution when the
system matrix is ill-conditioned. The motivation for using a variable
shaping matrix is based on the recover-and-stay property of the
generalized Landweber iteration, which is stated and proved in [3].
This property states that in the generalized Landweber iteration, once
an image component is completely recovered, the component will not
change during further iterations, even if a different shaping matrix
is used afterwards. Our new approach uses two different shaping
matrices; the first one is for fast recovery of low frequency
components and partial recovery of high frequency components; the
second one may be used either for speeding up the reconstruction of
high frequency components (if noise is not a problem), or for
suppressing high frequency components while maintaining the recovered
low frequency components.

There are three advantages in using this new approach [3]: 1) the
number of iterations, as well as the characteristics of the filtering,
can be determined before the iteration starts; 2) the computation time
and storage space are greatly reduced as compared to TIF; and 3) once
a filter is designed from two shaping matrices, a different filter can
be obtained by simply changing the gain factor in the generalized
Landweber iteration, or by using the DC-suppression procedure [3].
Therefore, by changing the gain factor, a spectrum of filters can be
derived and investigated off-line to find out which provides the most
suitable solution to the image reconstruction problem.

This paper is organized as follows. Section II formulates the image
reconstruction problem in emission tomography as a problem of solving
a linear system of equations, reviews TIF and two alternative
iterative methods for solving a system of linear equations, the
algebraic reconstruction technique (ART) and the conjugate gradient
(CG) method, and also reviews the generalized Landweber iteration.
Section III presents the new variable shaping matrix method. Two
examples of using the method, to accelerate the reconstruction of high
frequency components and to attenuate the high frequency components in
the image to achieve regularization, are presented separately. A
comparison of computational requirements between the variable shaping
matrix method and TIF is made. Finally, Section IV concludes the paper
with a summary and discussion of possible applications of the variable
shaping matrix to other signal processing problems.


II.  Background 

A.  System  Equation 

The image reconstruction problem in emission tomography can be
formulated [4] as solving the linear system of equations

$$Ax = b, \tag{1}$$

where $A$ is an $m \times n$ system matrix which describes the system
geometry, $x$ is an $n \times 1$ vector of the image pixels, and $b$
is an $m \times 1$ vector of the measured projections of the image.

The problem is to determine $x$ given $A$ and $b$, a typical task of
solving a linear system of equations. An obvious solution to (1) is
the pseudo-inverse or minimum-norm least-squares solution:

$$x^* = (A^T A)^{\dagger} A^T b, \tag{2}$$

where $A^T$ is the matrix transpose of $A$ and $(A^T A)^{\dagger}$ is
the pseudo-inverse of $A^T A$ (which may not have full rank). The
system matrix $A$, in addition to being large [5], is normally
ill-conditioned, so that a large change in $x^*$ can result from a
small perturbation in $b$. Therefore, $x^*$ is an unsatisfactory
solution if there is noise in the measured projection data $b$. The
ill-conditioning is due to the presence of high frequency components
of $x^*$, on the singular vectors with small singular values of $A$.

B. Truncated Inverse Filtering

Truncated inverse filtering (TIF) [1] is used to derive a stable
solution of (1) by removing the high frequency components of $x^*$,
which are very sensitive to the measurement noise in $b$. The TIF
solution is found as follows. First, compute the SVD of $A$:
$A = U \Sigma V^T$. Here $U = [u_1, u_2, \ldots, u_m]$ and
$V = [v_1, v_2, \ldots, v_n]$ are orthogonal matrices, and $\Sigma$ is
“diagonal” in that $\Sigma_{i,j} = 0$ unless $i = j$, in which case
$\Sigma_{i,i} = \sigma_i$, the singular values of $A$. Without loss of
generality, let
$\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_{\rho(A)} > 0$, where
$\rho(A)$ is the rank of $A$. Second, choose a threshold $\sigma_t$
and discard the image components on singular vectors associated with
singular values $\sigma_i < \sigma_t$. The number of image components
kept will be $k \leq m, n$. Finally, form the TIF solution using

$$x_{TIF} = \sum_{i=1}^{k} \frac{1}{\sigma_i} \langle b, u_i \rangle v_i, \tag{3}$$

where $\langle b, u_i \rangle = u_i^T b$. This approach is
straightforward if the size of $A$ is small. However, in image
reconstruction in emission tomography, the size of $A$ will be several
thousand by several thousand, and it will be even larger in
3-dimensional image reconstruction. The amount of computation and
memory required to implement (3) can become enormous, even though the
SVD is precomputed off-line (see subsection III.C).

C. Alternative Iterative Approaches

Since the matrix $A$ is sparse in emission tomography, an iterative
method may be preferable when solving (1). Among iterative approaches
[6] to solve (1), the algebraic reconstruction technique (ART) and the
conjugate gradient (CG) method have received most attention, for their
speed in deriving an approximate solution to (1). ART is a technique
of projection onto convex sets (POCS). The CG method utilizes some
orthogonality conditions. Since the sequence of approximate solutions
from ART or CG depends on the projection $b$, neither ART nor CG can
guarantee that a satisfactory result will be obtained after a given
number of iterations.

D. Generalized Landweber Iteration

It has been shown [7] that the generalized Landweber iteration with
the DC-suppression procedure can be as fast as the ART or CG
iterations. Moreover, the iteration is governed by the shaping matrix
and the number of iterations. Once the shaping matrix and the number
of iterations are determined, the outcome of the iteration can be
viewed as a known filtering operation on the reconstructed image. The
iteration is system dependent, not object dependent, and it allows
control over the frequency content of the reconstructed image, which
ART and CG cannot provide.

The generalized Landweber iteration has the form

$$x^{i+1} = x^{i} + \alpha D A^T (b - A x^{i}), \tag{4}$$

where $\alpha$ is a gain factor usually set to
$\alpha = 1/\sigma_1^2$, $D$ is a shaping matrix (a polynomial
function of $\alpha A^T A$), and $x^{i}$ is the reconstructed image
after the $i$-th iteration. Here $\sigma_1$ is the largest singular
value. For convenience, we define the multiplication of $A$ by a
vector as a forward projection, and the multiplication of $A^T$ by a
vector as a backward projection. In analyzing the computational
efficiency of an iterative method, the number of forward and backward
projections usually serves as a better indicator than the number of
iterations does [7]. If the order of the polynomial function used to
represent $D$ is $l$, then each single generalized Landweber iteration
will need $(l + 1)$ forward and backward projections [8]. When
initialized with zero, the iteration converges to $x^*$, provided the
Euclidean norm $\|\alpha D A^T A\|_2 < 2$ [2].
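
The projection count of $(l + 1)$ per iteration follows from applying the polynomial shaping matrix by Horner's rule. The sketch below is an illustration in pure Python, not the authors' implementation. The matrix `A`, the test image, and the function names are made up for the example; `A` is chosen with orthogonal columns so that its singular values are 1 and 0.5 and $\alpha = 1$.

```python
def matvec(M, v):
    """Dense matrix-vector product on nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def generalized_landweber(A, b, coeffs, alpha, n_iter):
    """Iterate x <- x + alpha * F(alpha A^T A) A^T (b - A x) from x = 0.

    coeffs holds the polynomial F, lowest order first. Returns the image
    and the number of forward + backward projection pairs used.
    """
    At = transpose(A)
    x = [0.0] * len(A[0])
    pairs = 0
    for _ in range(n_iter):
        # One pair: forward projection A x, backward projection of residual.
        Ax = matvec(A, x)
        r = matvec(At, [bi - ai for bi, ai in zip(b, Ax)])
        pairs += 1
        # Horner's rule for y = F(alpha A^T A) r: one pair per degree.
        y = [coeffs[-1] * ri for ri in r]
        for c in reversed(coeffs[:-1]):
            AtAy = matvec(At, matvec(A, y))
            pairs += 1
            y = [alpha * t + c * ri for t, ri in zip(AtAy, r)]
        x = [xi + alpha * yi for xi, yi in zip(x, y)]
    return x, pairs


# Sixth-order polynomial F(lambda), lowest order first.
F = [31.5, -315.0, 1443.75, -3465.0, 4504.5, -3003.0, 804.375]

A = [[0.6, -0.4], [0.8, 0.3], [0.0, 0.0]]   # singular values 1.0 and 0.5
x_true = [2.0, -3.0]
b = matvec(A, x_true)

x, pairs = generalized_landweber(A, b, F, alpha=1.0, n_iter=3)
print(x, pairs)
```

With a sixth-order polynomial, three iterations use 3 × 7 projection pairs, and both components of this small example are already close to the true image because their singular values lie above 0.2.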

One example of $D$ is $D = F(\alpha A^T A)$, where the polynomial
function $F(\cdot)$ is chosen to be [2]

$$F(\lambda) = 31.5 - 315\lambda + 1443.75\lambda^2 - 3465\lambda^3 + 4504.5\lambda^4 - 3003\lambda^5 + 804.375\lambda^6. \tag{5}$$

This choice is made because $\lambda F(\lambda)$ is a good
approximation to the unit step function in the range
$\lambda \in (0, 1]$. This covers the entire spectrum of frequency
components from the highest frequency component $\sigma = 0$ to the
lowest frequency component $\sigma = 1$ after the normalization
$\alpha = 1/\sigma_1^2$.
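
How closely $\lambda F(\lambda)$ tracks the unit step can be checked by direct evaluation. This is a minimal sketch assuming normalized singular values, so that $\lambda = \sigma^2$, and a zero initial image; `gain` and `response` are illustrative names.

```python
F = [31.5, -315.0, 1443.75, -3465.0, 4504.5, -3003.0, 804.375]


def poly(coeffs, lam):
    """Evaluate c0 + c1*lam + ... by Horner's rule."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * lam + c
    return acc


def gain(sigma):
    """Single-iteration gain lam*F(lam) at singular value sigma."""
    lam = sigma ** 2
    return lam * poly(F, lam)


def response(sigma, n_iter):
    """Fraction of a component recovered after n_iter iterations from zero."""
    return 1.0 - (1.0 - gain(sigma)) ** n_iter


for sigma in (0.05, 0.1, 0.2, 0.4, 0.6, 0.8, 1.0):
    print(sigma, round(gain(sigma), 4), round(response(sigma, 3), 4))
```

A gain near 1 means a component is recovered almost entirely in one iteration, while the small gains at low $\sigma$ are what leave the high frequency components only partially recovered after three iterations.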

III. Variable Shaping

From the recover-and-stay property, we have found that one shaping
matrix can be used for the first several iterations to recover the low
frequency components of the image, and then another shaping matrix,
which requires less computation than the first shaping matrix in a
single generalized Landweber iteration, can be used to accelerate the
reconstruction of high frequency components. We also found that if the
second shaping matrix provides negative gains in a range of high
frequency components, the partial recovery of the high frequency
components in this range using the first shaping matrix can be
removed. The result is very similar to the result using TIF. Two
different examples are provided to demonstrate 1) acceleration of the
convergence of high frequency components, and 2) attenuation of high
frequency components.
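
The recover-and-stay property can be seen from a short calculation. The following is a sketch rather than the proof in [3]. It assumes the SVD $A = U \Sigma V^T$, writes $c_j^{(i)} = v_j^T x^{i}$ for the coefficient of the $i$-th iterate on the singular vector $v_j$ (notation introduced here only for illustration), and lets $F_i$ be whichever polynomial is in use at iteration $i$.

```latex
\begin{align*}
c_j^{(i+1)} &= c_j^{(i)} + \lambda_j F_i(\lambda_j)\,\bigl(c_j^{*} - c_j^{(i)}\bigr),
  \qquad \lambda_j = \alpha \sigma_j^{2}, \quad
  c_j^{*} = \frac{u_j^{T} b}{\sigma_j}, \\
c_j^{*} - c_j^{(i+1)} &= \bigl(1 - \lambda_j F_i(\lambda_j)\bigr)\,
  \bigl(c_j^{*} - c_j^{(i)}\bigr).
\end{align*}
```

If a component is already recovered, so that $c_j^{*} - c_j^{(i)} = 0$, the second line keeps it at zero error for any choice of $F_i$. If $F_i(\lambda_j) < 0$, the multiplier exceeds 1, so a partially recovered component is pushed back away from $c_j^{*}$, which is the mechanism used for attenuation.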

A. Acceleration of Reconstruction of High Frequency Components

Suppose the shaping matrix $D = F(\alpha A^T A)$, where $F(\cdot)$ is
defined in (5), is used in the first three generalized Landweber
iterations with $\alpha = 1/\sigma_1^2$. The frequency components with
singular values above 0.2 will be recovered in these three iterations
(which are equivalent to 21 forward and backward projections). Fig. 1
shows the filter after the three iterations.

Starting with the fourth iteration, suppose the iteration switches the
shaping matrix from $D = F(\alpha A^T A)$ to
$D' = F'(\alpha A^T A)$, where $F'(\cdot)$ is defined as

$$F'(\lambda) = 25.94897 - 126.47433\lambda + 200.73822\lambda^2 - 100.03043\lambda^3. \tag{6}$$

$F'(\lambda)$ is obtained from the Chebyshev approximation [9] of a
third-order polynomial to $F(\lambda)$ in (5). The idea is to use a
lower order polynomial to approximate $F(\lambda)$ to maintain the
response in the range of high frequency components, while relaxing the
response in the range of low frequency components, which have already
been recovered. Fig. 2 shows that $F'(\lambda)$ actually outperforms
$F(\lambda)$ in recovering 95% of the high-frequency components with
singular values $\sigma \in (0, 0.2]$.

Figure 1: The resulting filter after the three iterations.

Figure 2: Comparison of performance in recovering 95% of components on
various singular values using the generalized Landweber iterations
with $F(\lambda)$ (marked as G-Landweber(6th)), and with
$F'(\lambda)$ (marked as G-Landweber(app)). It is clear that
$F'(\lambda)$ outperforms $F(\lambda)$ in $\sigma \in (0, 0.2]$, since
it requires less computation than $F(\lambda)$.

B.  Attenuation  of  High  Frequencies 

We now demonstrate another way of using a variable shaping matrix: to
attenuate high frequency components in the reconstructed image to
obtain a smooth image. This can also be viewed as a solution to a
regularized image reconstruction problem. Using a shaping matrix
$D = F(\alpha A^T A)$ as in (5) will partially recover some high
frequency components, which may deteriorate the image quality
significantly if these components are noisy. Thus it may be desirable
to remove the high frequency components, including both the
reconstructed image and the reconstructed noise, in the range (say)
$\sigma \in (0, 0.1]$, to make the image smooth. We now show how the
generalized Landweber iteration can do this during the image
reconstruction process.

Suppose the polynomial function $F(\lambda)$ in (5) is used for the
first three generalized Landweber iterations to ensure that low
frequency components with $\sigma \in (0.2, 1]$ have been recovered
(see Fig. 1).

Now suppose the following polynomial function [3]

$$F''(\lambda) = -10.972414 + 209.51804\lambda - 1072.02\lambda^2 + 2599.2093\lambda^3 - 3313.8866\lambda^4 + 2146.6546\lambda^5 - 557.55271\lambda^6 \tag{7}$$

is used for the subsequent iterations. $F''(\lambda)$ provides
negative gains in the range $\sigma \in (0, 0.28]$, which covers
$\sigma \in (0, 0.2]$. $F''(\lambda)$ is used for another nine
generalized Landweber iterations to ensure that sufficient attenuation
or removal of high frequency components has been achieved.

Now, if $F(\lambda)$ is used for the first three generalized Landweber
iterations, and $F''(\lambda)$ for the subsequent nine iterations, the
resulting response after these twelve iterations is similar to the
result of low-pass filtering with a transition band having low cutoff
frequency at $\sigma_{lo} = 0.19$ and high cutoff frequency at
$\sigma_{hi} = 0.08$. Fig. 3 shows the resulting low-pass filter after
the twelve iterations. The transition band cutoff frequencies
$\sigma_{lo}$ and $\sigma_{hi}$ were chosen at the frequencies with
gains 0.95 and 0.05, respectively. In general, this type of shaping is
also referred to as roll-off shaping [1].


Figure  3:  The  resulting  filter  after  the  twelve  iterations. 

We used this method to simulate the reconstruction of the complex
phantom in the simulated PET-3 geometry from noisy data [3]. The
number of total counts was 1 million; the data noise was Poisson in
nature. Fig. 4 shows the result of the third and the twelfth
iterations. Compare the third and the twelfth reconstructed images of
Fig. 4. As expected, the twelfth reconstructed image is smoother than
the third image.


An important feature of the generalized Landweber iteration
can also be seen in Fig. 4: not only have we obtained
the reconstructed images, but we also have a known
relation between a reconstructed image and the shaping
filter generating the image. This known relation is unique
to the generalized Landweber iteration; it is not present
in the ART or CG iteration. Since $F(\lambda)$ and $F''(\lambda)$ are
both polynomial functions of order 6, 84 (= (6 + 1) × 12)
forward and backward projections are known to be needed
to obtain the desired image characteristics.

In examining the images in Fig. 4, note the change due
to the reconstruction and removal of the high frequency
components in $\sigma \in (0, 0.2]$. The deterioration of the image
after the first three iterations was due to the partial
reconstruction of these high frequency components.

C.  Computational  Comparison 

We  compare  the  computational  requirements  for  a  roll-off 
shaping  filter  (see  the  previous  subsection)  and  a  trun¬ 
cated  shaping  filter  as  in  TIF.  The  major  difference  is 
that  the  roll-off  shaping  can  be  iteratively  performed,  while 
the  truncated  shaping  must  be  computed  directly  using  an 
SVD;  this  requires  an  enormous  amount  of  computation. 

Direct  computation  of  z*  requires  storage  of  the  matri¬ 
ces  U  and  V  and  a  considerable  amount  of  computation  to 
implement  (3),  since  U  and  V  (unlike  A)  are  not  sparse. 
It  also  requires  an  (off-line)  computation  of  the  SVD  of  A. 
Consider  a  4000  x  4000  system  matrix  A,  and  assume  using 
four  bytes  to  store  a  number  in  floating-point  format.  Then 
128  million  (=  4000  x  4000  x  4  x  2)  bytes  will  be  required 
just  to  store  U  and  V!  On  the  other  hand,  if  A  is  97% 
sparse  (3%  of  the  elements  of  A  are  non-zero),  the  compu¬ 
tation  time  for  obtaining  z*  in  (3)  is  equivalent  to  about 
33 ($\approx$ 2/0.06) forward and backward projections since the
sparse  structure  of  A  can  be  utilized  in  the  iteration.  If  k 
in  (3)  is  chosen  to  preserve  half  of  the  singular  values,  then 
17  forward  and  backward  projections  are  required  for  the 
computation. 

In some applications, the system matrix A becomes
sparser as its size increases. For example, a 150-million-element
system matrix in [5] has about 2 million non-zero
elements, so that it is 98.7% sparse. In this case,
an SVD computation will become almost impossible. The
computation time for obtaining z* is equivalent to about
75 ($\approx$ 2/0.026) forward and backward projections. If half
of the singular values are preserved, about 37 forward and
backward projections will be needed. Hence even apart
from the storage problem and the difficulty in computing
the SVD of A, the iterative approach of roll-off shaping becomes
more favorable than the direct approach using TIF
as the size and sparsity of A grow.
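The storage and operation counts quoted in this subsection can be reproduced with a few lines of arithmetic. The sketch below is only illustrative: the function names are hypothetical, and the cost model (a dense product counted as n × n multiplications, a sparse forward plus backward projection as twice the non-zero fraction of that) is an assumption inferred from the quoted figures rather than something stated in the text.

```python
def dense_uv_storage_bytes(n, bytes_per_number=4):
    """Bytes needed to store the two dense n x n factors U and V."""
    return 2 * n * n * bytes_per_number


def equivalent_projection_pairs(nonzero_fraction):
    """Cost of the dense products with U^T and V, in sparse projection pairs.

    Assumed cost model: a dense n x n product costs n*n multiplications, and
    one forward plus one backward projection with a sparse A costs
    2 * nonzero_fraction * n*n multiplications.
    """
    return 2.0 / (2.0 * nonzero_fraction)


print(dense_uv_storage_bytes(4000))              # storage for U and V
print(equivalent_projection_pairs(0.03))         # 97% sparse example
print(equivalent_projection_pairs(2e6 / 150e6))  # 150-million-element example
```

Using the exact non-zero fraction 2/150 for the larger matrix, rather than the rounded 0.026, gives the figure of about 75 quoted above.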

IV.  Summary 

A new variable shaping matrix method for using the generalized
Landweber iteration in image reconstruction in
emission tomography has been developed. The method
uses two shaping matrices: the first is for fast recovery
of low frequency components, and the second is
either for accelerating the reconstruction of high
frequency components, if noise does not pose a problem
in recovering them, or for attenuating high frequency
components when noise is a problem.

Figure 4: The reconstructed images from (b) the third and (c) the twelfth iterations; (a) is the complex-phantom image.

Examples  of  using  a  variable  shaping  matrix  to  acceler¬ 
ate  the  reconstruction  of  high  frequency  components  and 
to  attenuate  high  frequency  components  were  presented. 
The savings in computation and storage space from using a
variable shaping matrix instead of TIF were also presented;
they result from the sparsity of A, which generally increases
as the size of A increases. Hence the larger the system, the
greater the advantage of the new method over TIF.

An  important  topic  for  further  research  is  application  of 
the  generalized  Landweber  iteration  with  a  variable  shap¬ 
ing  matrix  to  linear-algebra-based  signal  processing  prob¬ 
lems. The solution (3) is used in signal restoration problems,
harmonic retrieval, and deconvolution. The method
proposed in this paper can be applied to all of those problems;
even if the matrix is not sparse, the computation
required is less than that for an SVD. Another topic is further
study on designing the shaping matrix to achieve a
given  specification  of  low  and  high  cutoff  frequencies  with 
the least amount of computation.

Acceleration of Landweber-Type Algorithms by
Suppression of Projection on the
Maximum Singular Vector

Abstract: We develop a new procedure that speeds up convergence
during the initial stage (the first 100 forward and
backward projections) of Landweber-type algorithms for iterative
image reconstruction in PET, which include the Landweber,
generalized Landweber, and steepest descent algorithms. The
procedure  first  identifies  the  singular  vector  associated  with  the 
maximum  singular  value  of  the  PET  system  matrix,  and  then 
suppresses  projection  of  the  data  on  this  singular  vector  after 
a  single  Landweber  iteration.  We  show  that  typical  PET  system 
matrices  have  a  significant  gap  between  their  two  largest  singular 
values;  hence,  this  suppression  allows  larger  gains  in  subsequent 
iterations,  speeding  up  convergence  by  roughly  a  factor  of  three. 
New  contributions  of  this  paper  include:  1)  study  of  the  singular 
value  spectra  of  typical  PET  system  matrices,  2)  study  of  the  effect 
on  convergence  of  projection  on  the  maximum  singular  vector, 
and  3)  study  of  the  convergence  behavior  of  the  new  procedure 
applied  to  the  Landweber,  generalized  Landweber,  steepest  de¬ 
scent,  conjugate  gradient,  and  ART  algorithms  (comparison  is 
also  made  with  the  MLEM  algorithm). 

I.  Introduction 

Recently,  there  has  been  growing  interest  in  using  iter¬ 
ative  reconstruction  algorithms  in  tomographic  imaging 
[1]-[7]. In using an iterative reconstruction algorithm, the
presence  of  noise  from  the  imaging  process  and  roundoff  error 
requires  imposing  a  stopping  rule  to  stop  the  iteration  before 
the  statistical  noise  in  the  reconstructed  image  becomes  too 
big  [3]-[10],  or  incorporating  a  regularization  strategy  into 
the  algorithm  [11]-[13]. 

Most research has focused on the following iterative
reconstruction algorithms: algebraic reconstruction technique
(ART) [14], [15], maximum-likelihood-expectation-maximization
(MLEM) iteration [16], steepest descent (STP)
algorithm [13], [17], and conjugate gradient (CG) algorithm
[13], [17]. A problem with these algorithms is that there is
no clear formula to describe what has been achieved after
some number of iterations. Each ART iteration modifies the
reconstructed image by projecting from one hyperplane to
another hyperplane defined by a system of linear equations;
MLEM iteration maximizes a likelihood function; and each
STP and CG iteration searches adaptively for the largest
gradient defined by the least-squares error between the
projection data and the estimated projection data. In each case,
the precise meaning of the image following each iteration is
unclear.

A  solution  to  this  problem  is  to  use  a  more  controllable 
algorithm  for  which  the  reconstruction  process  can  be  exam¬ 
ined  analytically.  Using  the  Landweber  [18]  or  the  generalized 
Landweber  iteration  [19],  [20],  one  knows  exactly  what  singu¬ 
lar  value  spectral  components  are  recovered  in  each  iteration. 
The  generalized  Landweber  iteration,  with  a  proper  shaping 
matrix,  can  accelerate  and  regularize  the  reconstruction  process 
[20];  the  number  of  iterations  needed  and  the  convergence 
behavior  can  be  designed  before  the  iteration  begins,  and 
computation  time  can  be  estimated  accurately. 

In this paper we propose a new method, which we call
the DC-suppression procedure, for improving the convergence
rate of the Landweber-type iterations, which include the
Landweber, generalized Landweber, and STP algorithms. The
“DC  component”  of  an  image  is  defined  here  as  its  component 
along  the  singular  vector  associated  with  the  system  matrix’s 
maximum  singular  value.  We  first  demonstrate  numerically 
that  typical  PET  system  matrices  have  a  significant  gap  be¬ 
tween  their  two  largest  singular  values  (a  rigorous  proof  was 
provided  in  Johnstone  and  Silverman  [21]).  Also,  the  DC 
component  of  an  image  can  be  completely  recovered  by  the 
first  Landweber  iteration.  By  suppressing  this  component  in 
subsequent  iterations,  the  gain  factor  in  the  iteration  can  be 
increased,  speeding  up  the  convergence  of  the  other  image 
components.  For  practical  applications,  we  will  only  investi¬ 
gate  the  reconstruction  speed  of  various  algorithms  in  the  first 
100  forward  and  backward  projections. 

II. System Equation

The  system  equation  [15],  [16]  that  describes  the  transfor¬ 
mation  or  projection  process  in  tomographic  imaging  (e.g., 
SPECT  or  PET)  is  usually  represented  as 

Ax  =  b  (1) 

where $A$ is an $m \times n$ system matrix, which describes the system
geometry, $x$ is an $n \times 1$ vector of the image pixels, and $b$ is an
$m \times 1$ vector of the measured projections of the image. The
matrix $A$ is typically very sparse [5].

There  are  several  problems  associated  with  the  solution 
of  (1): 

1) the size of the system: typically $m$ and $n$ are on the
order of several thousand [5], [16];

2) ill-conditioning of the system: typically $A$ has a large
condition number [22], so that a small change in the projection
data $b$ may cause a large change in the solution $x$ or in the
minimum-norm least-squares solution $x^*$,

$$x^* = A^+ b \tag{2}$$

where $A^+$ is the pseudoinverse of $A$ (see (3) below) [23];

3) ill-posedness: typically $A$ may have a nonempty null
space, so that some components of the image cannot be
recovered from the projection $b$ without additional information.

The notations in singular value decomposition (SVD) [20],
[23] are adopted here for the convenience of discussion. The
SVD of $A$ is $A = U \Sigma V^T$. Here $U = [u_1, u_2, \ldots, u_m]$
and $V = [v_1, v_2, \ldots, v_n]$ are orthogonal matrices, and $\Sigma$
is "diagonal" in that $(\Sigma)_{i,j} = 0$ unless $i = j$, in which
case $(\Sigma)_{i,i} = \sigma_i$, the singular values of $A$. Without loss of
generality, let $\sigma_1 > \sigma_2 > \cdots > \sigma_{p(A)} > 0$, where $p(A)$ is
the rank of $A$. It can be shown [23] that the minimum-norm
least-squares solution to (1) is

$$x^* = V \,\mathrm{diag}\!\left[\frac{1}{\sigma_1}, \ldots, \frac{1}{\sigma_{p(A)}}, 0, \ldots, 0\right] U^T b = \sum_{i=1}^{p(A)} \frac{1}{\sigma_i} \langle b, u_i \rangle v_i \tag{3}$$

where $\langle b, u_i \rangle = u_i^T b$ is the inner product of vectors $b$ and $u_i$.

III.  Landweber-Type  Reconstruction  Algorithms 

Throughout this section, we assume that $A$ has maximum
singular value $\sigma_1 = 1$. We relax this assumption in Section IV.
While some of this material is taken from [18], [19], and [20],
we make some important new points at the end.

A.  Definition 

We define a Landweber-type iteration as an iteration of the
form

$$x^{k+1} = x^k + \alpha_k D A^T (b - A x^k) \tag{4}$$

where $\alpha_k$ is a scalar and $x^k$ is the reconstructed image after
the $k$th iteration. Equation (4) becomes the following:

1) the Landweber iteration [18] when $\alpha_k = \alpha$ (a constant)
and $D = I$ (identity matrix):

$$x^{k+1} = x^k + \alpha A^T (b - A x^k); \tag{5}$$

2) the generalized Landweber iteration [19] when $\alpha_k = \alpha$
(a constant) and $D$ is a shaping matrix (a polynomial function
of $\alpha A^T A$, see (11) and (12) below):

$$x^{k+1} = x^k + \alpha D A^T (b - A x^k); \tag{6}$$

3) the STP iteration when $D = I$ and $\alpha_k$ adaptively changes
at each iteration:

$$x^{k+1} = x^k + \alpha_k A^T (b - A x^k) = x^k + \alpha_k q_k, \quad q_k = A^T (b - A x^k), \quad \alpha_k = \frac{q_k^T q_k}{(A q_k)^T A q_k}. \tag{7}$$

For simplicity, we will refer to both $\alpha$ and $\alpha_k$ as gain factors.

The results of [18], [19], and [20] summarized below all
assume that the maximum singular value of $A$ is $\sigma_1 = 1$;
this is why we also make that assumption in this section. In
Section IV, we relax this assumption: this is why we include
the gain $\alpha$ [which scales $A^T A$ in (5) and (6)]. For the rest
of this section, assume $\alpha = 1$ unless stated otherwise; we
continue to exhibit $\alpha$ for later convenience.

When initialized with zero, all three Landweber-type iterations
will converge to the minimum-norm least-squares
solution (3) [17], [20], provided $\|\alpha A^T A\|_2 < 2$ [18], [24]
(for the Landweber iteration) or $\|\alpha D A^T A\|_2 < 2$ [19], [25]
(for the generalized Landweber iteration).


B.  Convergence  and  Filtering  Control  of  the  Generalized 
Landweber  Iteration 


Define the components of an image as its projections on the
singular vectors $v_i$. These are analogous to (but not the same
as) Fourier frequency components, with "high frequencies" associated
with small singular values $\sigma_i$, and "low frequencies"
associated with large singular values [20]. Similar arguments
were adopted also in Barret et al. [26] and Smith et al.
[22]. This terminology is common (e.g., [12]), although these
frequencies coincide with Fourier frequencies only in a special
circumstance [10]. In the sequel, the image component on the
singular vector associated with $\sigma_1$ is referred to as the "DC
component," since it has the smallest "frequency."

Let the initial condition be zero. In the generalized Landweber
iteration, the definition of $D$ will be complete if the
image $D v_i$ of each singular vector $v_i$ in $R(A^T)$ (range space
of $A^T$) is specified [19], [20]. Strand [19] proposed that
the matrix $D$ could be designed by specifying the scalars
$p_1, \ldots, p_{p(A)}$ in

$$D v_i = p_i v_i, \quad 0 < p_i \sigma_i^2 < 2, \tag{8}$$

where the condition $0 < p_i \sigma_i^2 < 2$ ensures the convergence
of the generalized Landweber iteration. The reconstructed
image $x^k$ after the $k$th generalized Landweber iteration can
be represented as [20]

$$x^k = \sum_{i=1}^{p(A)} \frac{1 - (1 - p_i \sigma_i^2)^k}{\sigma_i} \langle b, u_i \rangle v_i. \tag{9}$$

Note that the gain after $k$ iterations (compare (3) and (9))

$$G(\sigma_i, k) = 1 - (1 - p_i \sigma_i^2)^k \tag{10}$$

illustrates the control over both convergence and filtering that
is possible in the generalized Landweber iteration. By varying
the $p_i$'s, it is possible to control the convergence rates of each
component of the image independently. And by varying $k$, the
extent of convergence can be controlled: some components
of the image can be partially filtered out by stopping the
iteration early. The latter control is also available in the
Landweber iteration.

PAN AND YAGLE: ACCELERATION OF LANDWEBER-TYPE ALGORITHMS

C.  Examples — Three  Choices  for  D 

We now demonstrate the convergence behavior of the generalized
Landweber iteration for three different choices of the
shaping matrix $D$. First, choose $D = I$, so that $p_i = 1$ and
the generalized Landweber iteration reduces to the Landweber
iteration. Equation (10) makes it clear that high-frequency
components of the image (those associated with small $\sigma_i$)
converge more slowly than low-frequency components (those
associated with large $\sigma_i$). Indeed, since $\sigma_1 = 1$, the DC
component converges (i.e., is completely recovered) after a
single Landweber iteration. This fact plays a vital role below.
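The behavior described here can be read off numerically from the gain formula (10). The following sketch is illustrative only; the function name `gain` and the sample singular values are assumptions chosen for the example.

```python
def gain(sigma, k, p=1.0):
    """Equation (10): fraction of a component recovered after k iterations."""
    return 1.0 - (1.0 - p * sigma**2) ** k


# Landweber case (p = 1): gains after 1, 10, and 30 iterations.
for sigma in (1.0, 0.5, 0.2):
    print(sigma, [round(gain(sigma, k), 4) for k in (1, 10, 30)])
```

For `sigma = 1.0` the gain is already one after a single iteration, which is the DC-component property used later, while components with smaller singular values are still only partly recovered after 30 iterations.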

Second, let $D$ be specified by choosing $p_i = 1/\sigma_i^2$, $\sigma_i \neq 0$,
in (9). Then the reconstructed image after the first iteration
is identical to the minimum-norm least-squares solution (3)!
However, this choice is not practical, since a lengthy SVD
computation of $A$ would be needed to obtain the $p_i$. The third
choice of $D$ is from [19], in which $D$ is given as a polynomial
function $F(\cdot)$ of $\alpha A^T A$, i.e., $D = F(\alpha A^T A)$ where

$$F(\lambda) = 31.5 - 315\lambda + 1443.75\lambda^2 - 3465\lambda^3 + 4504.5\lambda^4 - 3003\lambda^5 + 804.375\lambda^6. \tag{11}$$

The polynomial function $F(\lambda)$ is chosen so that $\lambda F(\lambda)$ is
a good approximation to the unit step function in the range
$\lambda \in [0, 1]$. This choice results in (see (8); assume $\alpha = 1$)

$$D v_i = F(A^T A) v_i = F(\sigma_i^2) v_i = p_i v_i \tag{12}$$

implying $F(\sigma_i^2) = p_i$, so that (11) effectively chooses $p_i \sigma_i^2 \approx 1$
to achieve the minimum-norm least-squares solution faster
[see (9)]. In the sequel, we make this choice of $D$ in the
generalized Landweber iteration.
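To see what the polynomial in (11) does to the gain (10), one can evaluate $\lambda F(\lambda)$ at a few values of $\lambda = \sigma_i^2$. The sketch below assumes NumPy and $\alpha = 1$; the names `F_COEFFS`, `F`, and `gain_generalized` are illustrative and not taken from the paper.

```python
import numpy as np

# Coefficients of F in (11), lowest degree first.
F_COEFFS = [31.5, -315.0, 1443.75, -3465.0, 4504.5, -3003.0, 804.375]


def F(lam):
    """Evaluate the shaping polynomial (11) at lam."""
    return np.polynomial.polynomial.polyval(lam, F_COEFFS)


def gain_generalized(sigma, k):
    """Equation (10) with p_i = F(sigma_i^2), i.e. alpha = 1."""
    lam = sigma**2
    return 1.0 - (1.0 - lam * F(lam)) ** k


for sigma in (1.0, 0.5, 0.2, 0.1):
    lam = sigma**2
    print(sigma, float(lam * F(lam)), float(gain_generalized(sigma, 3)))
```

Comparing these values with the Landweber case $p_i = 1$, where the one-iteration gain is simply $\sigma_i^2$, shows how much of each component a single generalized iteration recovers.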

These  convergence  properties  of  the  Landweber  and  the 
generalized  Landweber  iterations  are  illustrated  in  Fig.  1(a) 
and  (b),  respectively.  It  is  clear  for  both  algorithms  that  the 
low-frequency  components  (larger  singular  values)  converge 
faster  than  the  high-frequency  components  (smaller  singular 
values): the farther a component's singular value is from
one, the slower it converges. This is called the nonuniform
convergence  property  in  [20]. 

D.  Effect  of  DC  Component  on  Convergence 

Although the DC component converges after a single
Landweber iteration, Fig. 1(a) makes it clear that other
components in the Landweber iteration converge more slowly.
Furthermore, the smaller $\sigma_i$ is (relative to $\sigma_1 = 1$), the slower
the image component on $v_i$ will converge. For the generalized
Landweber iteration, Fig. 1(b) shows that the low-frequency
components (associated with $\sigma_i > 0.2$) all converge quickly,
but the high-frequency components (associated with $\sigma_i < 0.2$)
converge more slowly.

Fig. 1. The convergence properties of (a) the Landweber and (b) the generalized
Landweber iterations (axis label in both panels: singular value). These curves
were derived from $G(\sigma_i, k)$, with $k$ ranging from 1 to 30, for the Landweber
iteration ($p_i = 1$) and the generalized Landweber iteration ($p_i = F(\sigma_i^2)$).
The first, second, third, and last iterations are labeled.

Since the DC component converges immediately, it might
seem tempting to make $\alpha > 1$ in (5) and (6). Since $D =
F(\alpha A^T A)$, this scales all of the singular values of $A$,
bringing the smaller ones closer to one, so that their associated
components converge faster. Since the condition for stability in
the generalized Landweber iteration is $\|\alpha D A^T A\|_2 < 2$, there
seems to be a good margin. However, $D = F(\alpha A^T A)$ is a
matrix polynomial of degree $p$, each term having eigenvectors
$v_i$, and assuming that the highest-degree term dominates
results in

$$\|\alpha D A^T A\| \approx \alpha^{p+1} \|A^T A\| = \alpha^{p+1} < 2. \tag{13}$$

Hence, if $\alpha > \sqrt[p+1]{2}$ then the generalized Landweber iteration
will diverge. For the Landweber iteration ($p = 0$), $\alpha > 2$
makes the iteration diverge. In particular, $p = 6$, if the
polynomial function (11) is used, implies that $\alpha > \sqrt[7]{2} =
1.104$ will make the generalized Landweber iteration diverge;
hence increasing $\alpha$ significantly is not a viable option. Indeed,
due to roundoff error, it is prudent to bound $\alpha$ by $0 < \alpha \le 1$
in the generalized Landweber algorithm. In the Landweber
algorithm, $\alpha$ may be as large as 2 without endangering
stability, but increasing $\alpha$ will slow down the convergence
of low-frequency components [let $p_i = 1$ and see (10)].
Therefore, $\alpha$ is usually set equal to one [19].

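To see precisely what happens when $\alpha > 1$, recall
from (12)

$$\alpha D A^T A v_i = F(\alpha A^T A)(\alpha A^T A) v_i = (\alpha \sigma_i^2) F(\alpha \sigma_i^2) v_i = \lambda F(\lambda) v_i, \quad \lambda = \alpha \sigma_i^2. \tag{14}$$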


Fig. 2. The singular value histogram of the system matrix $A_1$ (axis label:
normalized singular value). Note the singular values have been normalized,
i.e., divided by their largest singular value.

Fig. 3. The flowchart of the DC-suppression procedure.

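But $\lambda F(\lambda) \approx 1$ only for $\lambda \in [0, 1]$; outside this interval
$\lambda F(\lambda) \notin [0, 2]$ and the algorithm diverges. Hence stability
requires that $\lambda = \alpha \sigma_i^2 \le 1$, so that $\alpha \le 1/\sigma_i^2$ for all $i$, which
requires $\alpha \le 1/\sigma_1^2$.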
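The stability argument around (13) and (14) can be checked numerically by looking at the per-iteration error multiplier $1 - \lambda F(\lambda)$ of a single component. This is a sketch under the assumption that a component converges exactly when this multiplier has magnitude below one, which follows from (10); the function names are illustrative and NumPy is assumed.

```python
import numpy as np

# Coefficients of F in (11), lowest degree first.
F_COEFFS = [31.5, -315.0, 1443.75, -3465.0, 4504.5, -3003.0, 804.375]


def error_factor(alpha, sigma):
    """Per-iteration error multiplier 1 - lam*F(lam), lam = alpha*sigma^2."""
    lam = alpha * sigma**2
    return 1.0 - lam * np.polynomial.polynomial.polyval(lam, F_COEFFS)


def is_stable(alpha, sigma):
    """A component converges only if its error multiplier has magnitude < 1."""
    return abs(error_factor(alpha, sigma)) < 1.0


print(2 ** (1 / 7))                                # threshold quoted for p = 6
print(is_stable(1.0, 1.0), is_stable(1.2, 1.0))    # DC component, two gains
```

Raising the gain to 1.2 while the largest singular value is one pushes the DC component outside the stable range, which is the limitation the DC-suppression procedure is designed to remove.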

IV.  The  New  DC-Suppression  Procedure 

We now relax the assumptions that the maximum singular
value $\sigma_1$ of $A$ is one, and that $\alpha = 1$. It is clear that choosing
$\alpha = 1/\sigma_1^2$ reduces this general case to the case where the
maximum singular value of $A$ is one, since $D = F(\alpha A^T A)$
and $\alpha$ multiplies $A^T A$ in (5) and (6). Then all of the results
of Section III apply.

A. Motivation

Three elements motivated the development of the new DC-suppression
procedure. First, the gain factor $\alpha$ is effectively
bounded by $1/\sigma_1^2$ (see above), and Fig. 1(a) and (10) show
that the DC component of the image, associated with $\sigma_1$,
is recovered after a single Landweber iteration. This DC
component impedes the speed of reconstruction of the other
components, by limiting the gain factor $\alpha$. However, if the
DC component were removed from the iteration, the gain
factor $\alpha$ could be increased to $1/\sigma_2^2$ without causing instability.
This would speed up the reconstruction of the other frequency
components, since their associated singular values would be
scaled closer to one.

Second, the system matrix $A$ for PET systems has all non-negative
elements, as does $A^T A$. Such matrices tend to have
a large gap between their two largest singular values $\sigma_1$ and
$\sigma_2$; bounds on the ratio $\sigma_1/\sigma_2$ have been given in [27].
Hence we expect $\sigma_1$ to be significantly larger than $\sigma_2, \sigma_3, \ldots$.
This phenomenon was observed in analyzing the distribution
of the singular values of the system matrix $A_1$ in [20]. $A_1$
was the 195 × 144 system matrix describing a synthetic PET
geometry with 12 × 12 pixels centered in a detector ring of
radius 1, which hosted 26 identical detectors around the ring
circle. Fig. 2 shows the histogram of the singular values of
$A_1$; there is a large gap between the two largest singular values
($\sigma_1 = 1.0433$, $\sigma_2 = 0.61995$, the ratio $r_\sigma = \sigma_1/\sigma_2 = 1.683$). The
singular-value spectra of larger system matrices also exhibit
this phenomenon; see Section V. Since the singular values
except $\sigma_1$ are close together (see Fig. 2, and compare with
the gap between $\sigma_1$ and $\sigma_2$), deriving $\sigma_3$ and $v_3$ will not be
easy, and there is not much one can gain from deriving them.

Finally, to remove the recovered DC component from the
reconstructed image after the first Landweber iteration,
we need: 1) the maximum singular value $\sigma_1$ of the system
matrix $A$, to set $\alpha_1 = 1/\sigma_1^2$; 2) the singular vector $v_1$, so
that the projection of the image on $v_1$ can be removed
from subsequent Landweber-type iterations; and 3) the second-largest
singular value $\sigma_2$, so that $\alpha_2 = 1/\sigma_2^2 > \alpha_1$ can be
used in the subsequent Landweber or generalized Landweber
iterations, speeding up the convergence of the other image
components by scaling their associated singular values closer
to one. Note that the STP iteration does not use $\alpha_2$.

B. Determination of $\sigma_1$, $v_1$, and $\sigma_2$: The Power Method

The power method [28] is used to calculate $\sigma_1$ and $v_1$. After
determining $v_1$, the method with a purification procedure [28]
can be used to determine $\sigma_2$ and $v_2$. The power method is an
iterative method; convergence of the iteration for deriving $\sigma_1$
and $v_1$ depends on the ratio $r_\sigma = \sigma_1/\sigma_2$: the larger $r_\sigma$ is, the
quicker the convergence will be. Since $r_\sigma$ is relatively large for
PET system matrices, the power method converges relatively
quickly. The initial conditions for finding $v_1$ and $v_2$ are
$[1, 1, \ldots, 1]^T$ (all 1's) and $[1, 1, \ldots, 1, -1, -1, \ldots, -1]^T$ (half
1's and half $-1$'s), respectively, since they are geometrically
similar to the DC and first harmonic components in an LSI
system.
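A minimal sketch of this computation is given below, assuming NumPy. It applies the power method to $A^T A$ using the two starting vectors named above. The "purification" step is assumed here to mean projecting out the already-computed $v_1$ at every iteration; the exact procedure of [28] may differ, and the function names are illustrative.

```python
import numpy as np


def power_method(A, x0, n_iter, remove=()):
    """Power iteration on A^T A; vectors in `remove` are projected out each step."""
    v = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        for w in remove:              # "purification": stay orthogonal to w
            v = v - (v @ w) * w
        v = A.T @ (A @ v)             # one forward and one backward projection
        v = v / np.linalg.norm(v)
    for w in remove:
        v = v - (v @ w) * w
    v = v / np.linalg.norm(v)
    sigma = np.linalg.norm(A @ v)     # singular value associated with v
    return sigma, v


def two_largest(A, n_iter=200):
    """Return sigma_1, v_1, sigma_2, v_2 using the starting vectors from the text."""
    n = A.shape[1]
    ones = np.ones(n)
    half = np.concatenate([np.ones(n // 2), -np.ones(n - n // 2)])
    sigma1, v1 = power_method(A, ones, n_iter)
    sigma2, v2 = power_method(A, half, n_iter, remove=(v1,))
    return sigma1, v1, sigma2, v2
```

Each pass through the loop costs one forward and one backward projection, so the iteration counts reported for the power method translate directly into projection counts.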

C. The New DC-Suppression Procedure

The new DC-suppression procedure is as follows.

1) Compute $\sigma_1$, $v_1$, and $\sigma_2$ using the power method.

2) Set $\alpha_1 = 1/\sigma_1^2$ and perform a single Landweber iteration,
yielding $x^1$.

3) Compute the DC component of the image $x_{DC} =
\langle x^1, v_1 \rangle v_1$.

4) Suppress the DC component of the image by subtracting
$x_{DC}$ from $x^1$.

5) Set $\alpha_2 = 1/\sigma_2^2$ for the subsequent Landweber or generalized
Landweber iterations.

6) At each subsequent Landweber or generalized Landweber
iteration, suppress the DC component of the image
$x^k$ by subtracting $\langle x^k, v_1 \rangle v_1$ from $x^k$.

7) After termination, add $x_{DC}$ to the final image.

Step 6) is necessary due to computational roundoff
error: theoretically there should be no component of $x^k$ on
$v_1$, but roundoff error will create a very small component in the
(generalized) Landweber iteration, which accumulates as the
iteration proceeds. The DC-suppression procedure is illustrated
by a flowchart in Fig. 3.

Fig. 4. (a) The singular vector $v_1$ and (b) its 3-D shade plot. Note the openings in (b) are only for 3-D display purpose; the real image does not
have these openings.

Fig. 6. (a) The Shepp-Logan phantom and (b) the complex phantom.

Fig. 7. The number of forward and backward projections needed for the recovery of 95% of components on various singular vectors. Without loss of
generality, the largest singular value is assumed to be 1. The symbol DC after Landweber means Landweber with DC suppression; G-Landweber
means generalized Landweber. Note that $r_\sigma = 1.68$.

Fig. 8. The performance of reconstructing the Shepp-Logan phantom using (a) the Landweber iteration, the generalized Landweber iteration, and their
DC suppression implementations, (b) ART iterations with zero, average, and DC suppression implementation, (c) STP iteration, CG iteration, and their DC
suppression implementations, and (d) the fastest methods in the previous three figures (one from each figure) and MLEM iteration.
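The seven steps can be sketched for the plain Landweber iteration as follows. This is an illustrative implementation assuming NumPy and dense arrays, not the authors' code: the function names are hypothetical, and $\sigma_1$, $v_1$, and $\sigma_2$ are passed in as arguments (step 1) so that any method can be used to obtain them.

```python
import numpy as np


def landweber(A, b, n_iter, alpha):
    """Plain Landweber iteration (5) from a zero start, for comparison."""
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = x + alpha * (A.T @ (b - A @ x))
    return x


def landweber_dc_suppressed(A, b, n_iter, sigma1, v1, sigma2):
    """Landweber iteration with DC suppression; n_iter counts all iterations."""
    # Step 2: one Landweber iteration from zero with alpha_1 = 1 / sigma_1^2.
    x = (1.0 / sigma1**2) * (A.T @ b)
    # Steps 3 and 4: store the DC component and remove it from the image.
    x_dc = (x @ v1) * v1
    x = x - x_dc
    # Step 5: larger gain for the remaining iterations.
    alpha2 = 1.0 / sigma2**2
    for _ in range(n_iter - 1):
        x = x + alpha2 * (A.T @ (b - A @ x))
        x = x - (x @ v1) * v1         # Step 6: re-suppress the DC component
    return x + x_dc                   # Step 7: restore the DC component
```

For the same number of iterations, every non-DC component is updated with the larger gain $\alpha_2$, so its error shrinks by a smaller factor per iteration than in the plain iteration, while the DC component is carried along unchanged from the first iteration.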

V.  Numerical  Experiments  and  Discussions 

A.  Data  Preparation 

The PET geometry in [29] (similar to that of [16]), with
128 detectors and a square array of 4096 (= 64 × 64) pixels
circumscribed by the detector ring, was used. The object
support was assumed to be in the inscribed circle of the square
array, accounting for 3228 pixels (determined by counting
center positions of pixels inside the circle). A tube is defined
by any two different detectors [16]; some tubes defined in this
way may not intersect any pixel of the image at all, and these
were omitted to reduce computation time. Theoretically, there
were 8128 (= 128 × 127/2) different tubes for this geometry;
after omitting tubes, this was reduced to 4160 tubes. Hence
the corresponding system matrix $A_2$ was 4160 × 3228.

Fig. 9. The performance of reconstructing the complex phantom using (a) the Landweber iteration, the generalized Landweber iteration, and their DC
suppression implementations, (b) ART iterations with zero, average, and DC suppression implementation, (c) STP iteration, CG iteration, and their DC
suppression implementations, and (d) the fastest methods in the previous three figures (one from each figure) and MLEM iteration.

We assume the conditional probability $P(t/i)$ of an annihilation
occurring at pixel $i$ and its consequent coincidence event
being detected by tube $t$ is proportional to the joint angle of
pixel $i$ to tube $t$, which is

$$P(t/i) = \frac{\text{joint angle in radians from pixel } i \text{ to tube } t}{\pi}. \tag{15}$$

A normalization [16] makes $\sum_t P(t/i) = 1$, where the sum runs over the
tubes intersecting pixel $i$. The system matrix
$A_2$ is defined by having $(A_2)_{t,i} = P(t/i)$ after normalization.

The singular vectors $v_1$ and $v_2$ of $A_2$ are shown in Figs. 4
and 5, respectively. $v_1$ looks roughly like the DC component in
an LSI system; $v_2$ looks like the first harmonic in an LSI system.
$v_1$ and $\sigma_1 = 0.986703$ were computed using 9 iterations of
the power method; $v_2$ and $\sigma_2 = 0.587256$ were computed
using 25 iterations of the power method with the purification
procedure. The ratio $r_\sigma = 1.680$ was strikingly close to the
ratio $r_\sigma = 1.683$ of $A_1$ (dimension 195 × 144), due to similarity
between the geometries.

If one considers the attenuation correction for the object,
the system matrix must be modified each time a different
object is imaged. However, the computation using the power
method to derive the singular values and singular vectors can
be carried out in parallel with the data acquisition process, so
the computational overhead can be minimized.


The Shepp-Logan phantom [30] and the complex phantom of [31] shown in Fig. 6(a) and (b), respectively, were used as the test objects or images. Using two different objects avoided bias in interpreting the numerical results.

B. Acceleration of Convergence in the Landweber and the Generalized Landweber Iterations for $r_\sigma = 1.68$

To show how the DC-suppression procedure accelerates convergence, let $r_\sigma = 1.68$, which approximates the values of $r_\sigma$ for the system matrices $A_1$ and $A_2$. Define multiplication of $A$ by an image vector (of dimension $n$) as a forward projection, and multiplication of $A^T$ by a projection data vector (of dimension $m$) as a backward projection. Most computation in iterative reconstruction algorithms consists of these two operations.

The computation for DC suppression consists of one forward and backward projection (see the first four blocks in Fig. 3). A single Landweber iteration also requires one forward and backward projection. The generalized Landweber iteration, with $D = F(\alpha A^T A)$ and $F(\cdot)$ a polynomial of degree $p$, requires $p + 1$ forward and backward projections [20]. In the examples to follow, $F(\cdot)$ is given in (11) and $p = 6$.
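The count of $p + 1$ forward and backward projections follows from evaluating the polynomial by Horner's rule. The sketch below is an illustration under stated assumptions: it uses the degree-6 coefficients printed as (2) in the conference version reproduced later in this document (whether they coincide with (11) is an assumption), and the function and variable names are hypothetical.

```python
# Coefficients c_0 .. c_6 of the degree-6 shaping polynomial F (assumed, see lead-in).
COEFFS = [31.5, -315.0, 1443.75, -3465.0, 4504.5, -3003.0, 804.375]

def matvec(A, x):
    """Forward projection A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def rmatvec(A, y):
    """Backward projection A^T y."""
    return [sum(A[t][i] * y[t] for t in range(len(A))) for i in range(len(A[0]))]

def generalized_landweber_step(A, b, x, alpha, coeffs):
    """One step x + alpha * F(alpha A^T A) A^T (b - A x).

    Returns the new image and the number of forward/backward projection pairs used.
    """
    residual = [bi - pi for bi, pi in zip(b, matvec(A, x))]  # forward projection
    g = rmatvec(A, residual)                                 # backward projection
    pairs = 1
    # Horner's rule: y = (...((c_p M + c_{p-1}) M + ...) M + c_0) g, with M = alpha A^T A.
    y = [coeffs[-1] * gi for gi in g]
    for c in reversed(coeffs[:-1]):
        My = [alpha * v for v in rmatvec(A, matvec(A, y))]   # one more pair
        pairs += 1
        y = [m + c * gi for m, gi in zip(My, g)]
    x_new = [xi + alpha * yi for xi, yi in zip(x, y)]
    return x_new, pairs

# Toy diagonal system with singular values 1 and 0.5 (illustration only).
A = [[1.0, 0.0], [0.0, 0.5]]
x_true = [2.0, 4.0]
b = matvec(A, x_true)
x1, pairs = generalized_landweber_step(A, b, [0.0, 0.0], 1.0, COEFFS)
```

With $p = 6$ the step uses one pair for the residual and gradient and six more for the polynomial, giving the seven pairs per iteration quoted in the text.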

Fig. 7 shows the number of forward and backward projections needed to recover 95% of image components on various singular vectors when $r_\sigma = 1.68$. Without loss of generality, the largest singular value is assumed to be 1; the curves were derived from $G(\sigma, k)$. Fig. 7 shows that after about 22 forward and backward projections (1 from the DC suppression and 21 from 3 generalized Landweber iterations): 1) the generalized Landweber iteration recovers more components than the Landweber iteration, 2) the (generalized) Landweber iteration with DC suppression recovers more high-frequency components than the (generalized) Landweber iteration alone, and 3) the generalized Landweber iteration with DC suppression recovers over 95% of all image components on singular vectors with singular values greater than 0.1; to achieve this without DC suppression, 63 forward and backward projections are necessary.

Hence the Landweber and generalized Landweber iterations without DC suppression require roughly three times as many projections to reach the same solution as the iterations with DC suppression. This factor-of-three speed-up, together with the possibility of parallel implementation, suggests greater application of these algorithms in the future.
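The size of this speed-up can be checked with a short calculation for the plain Landweber iteration. Initialized with zero and run with constant gain $\alpha$, that iteration leaves the fraction $(1 - \alpha\sigma^2)^k$ of the component on a singular vector with singular value $\sigma$ unrecovered after $k$ iterations. The sketch below assumes $\sigma_1$ normalized to 1, a gain $\alpha = 1/\sigma_1^2$ without DC suppression, and $\alpha = 1/\sigma_2^2 = r_\sigma^2$ with it; these gain settings and the function name are assumptions made for illustration, and the sketch does not reproduce the generalized Landweber curves of Fig. 7.

```python
import math

def iterations_for_recovery(sigma, alpha, fraction=0.95):
    """Smallest k with |1 - alpha*sigma^2|^k <= 1 - fraction (plain Landweber, zero start)."""
    lam = alpha * sigma * sigma
    if not 0.0 < lam < 2.0:
        raise ValueError("need 0 < alpha * sigma^2 < 2 for convergence")
    shrink = abs(1.0 - lam)
    if shrink == 0.0:
        return 1  # the component is recovered exactly in one iteration
    return max(1, math.ceil(math.log(1.0 - fraction) / math.log(shrink)))

r_sigma = 1.68
# Component with normalized singular value 0.1, largest singular value 1.
plain = iterations_for_recovery(0.1, 1.0)
# One extra forward/backward projection is charged for the DC suppression itself.
suppressed = 1 + iterations_for_recovery(0.1, r_sigma ** 2)
speedup = plain / suppressed
```

For small singular values the iteration count scales like $1/(\alpha\sigma^2)$, so under these assumptions the speed-up approaches $r_\sigma^2 \approx 2.8$, in line with the factor of roughly three stated above.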

C.  Numerical  Experiments 

The following iterative image reconstruction algorithms were considered: 1) MLEM [16], 2) ART [15], 3) STP [13], 4) CG [13], 5) Landweber [18], and 6) generalized Landweber [19]. Computation load for each algorithm was represented as the number of forward and backward projections, as follows:

1) each MLEM, ART, or Landweber iteration requires 1 forward and 1 backward projection,

2) each STP iteration requires 2 forward and 1 backward projections,

3) the first CG iteration requires 2 forward and 1 backward projections; each succeeding CG iteration requires 2 forward and 2 backward projections, and

4) each generalized Landweber iteration requires 7 forward and 7 backward projections.

In addition, the DC-suppression procedure requires an additional forward and backward projection.
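The bookkeeping above can be written down as a small helper, which is convenient when comparing algorithms at equal computational cost. This is only a sketch of the accounting listed in the text; the function name, the dictionary, and the choice to report forward and backward counts separately are assumptions.

```python
# (forward, backward) projections per iteration, as listed in the text.
PER_ITERATION = {
    "MLEM": (1, 1),
    "ART": (1, 1),
    "Landweber": (1, 1),
    "STP": (2, 1),
    "generalized Landweber": (7, 7),
}

def projection_cost(algorithm, iterations, dc_suppression=False):
    """Return total (forward, backward) projections for the given run."""
    if dc_suppression and algorithm == "MLEM":
        raise ValueError("DC suppression is not applied to MLEM")
    if algorithm == "CG":
        # First CG iteration: 2 forward, 1 backward; later ones: 2 forward, 2 backward.
        forward = 2 * iterations
        backward = 2 * iterations - 1 if iterations > 0 else 0
    else:
        f, b = PER_ITERATION[algorithm]
        forward, backward = f * iterations, b * iterations
    if dc_suppression:
        forward += 1
        backward += 1
    return forward, backward

cost_gl = projection_cost("generalized Landweber", 3, dc_suppression=True)
cost_cg = projection_cost("CG", 3)
```

Three generalized Landweber iterations with DC suppression come to 22 forward and 22 backward projections, matching the figure of "about 22" used in the preceding subsection.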

Figs. 8 and 9 show the numerical results for reconstructing the Shepp-Logan phantom and the complex phantom, respectively. The vertical axis represents Euclidean distance between the original image and the reconstructed image; the horizontal axis represents the number of forward and backward projections. The iterations were initialized with zero, unless otherwise specified. We discovered that ART with average initial condition [20] (every pixel is set equal to the number of total counts divided by the number of pixels) is comparable to ART with DC suppression; this was therefore included for comparison. An average initial condition was also used in MLEM, but DC suppression cannot be applied to MLEM since a reconstructed image will then have negative pixels. Results for the first 100 forward and backward projections were compared.
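The two quantities used in this comparison are simple to state in code. The following sketch shows the average initial condition and the Euclidean distance used as the error measure; the function names are hypothetical and the numbers are made up for illustration.

```python
import math

def average_initial_image(counts, n_pixels):
    """Every pixel is set to the total number of counts divided by the number of pixels."""
    value = sum(counts) / n_pixels
    return [value] * n_pixels

def euclidean_distance(original, reconstructed):
    """Error measure plotted on the vertical axis of the convergence curves."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(original, reconstructed)))

x0 = average_initial_image([3.0, 5.0, 4.0], 4)       # toy projection data, 4 pixels
err = euclidean_distance([0.0, 0.0], [3.0, 4.0])     # toy images
```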

D.  Discussion  of  Results 

Figs.  8  and  9  illustrate  several  points. 

1) The ART, Landweber, generalized Landweber, and STP iterations all have their convergence rates accelerated by DC suppression; CG does not have its convergence accelerated; and DC suppression cannot be used for MLEM.

Fig. 10. The change of the gain factors in STP iteration after DC suppression for the reconstruction of (a) the Shepp-Logan phantom and (b) the complex phantom.

2) The generalized Landweber iteration with DC suppression was comparable to ART with DC suppression for the complex phantom [see Fig. 9(d)], but it was a little slower than ART for the Shepp-Logan phantom [see Fig. 8(d)]. Both ART and the generalized Landweber were faster than CG with DC suppression, and MLEM (for about the first 40 forward and backward projections). Although CG will converge to the solution in a finite number of iterations [32], it is not faster during these first 40 forward and backward projections.

3) ART with average initial condition was comparable to ART with DC suppression [see Fig. 8(b) and Fig. 9(b)]. This indicates that by setting the initial condition close to the solution, ART can be significantly accelerated. However, an average initial condition does not accelerate the convergence rate of the (generalized) Landweber iteration [20].

ART and the generalized Landweber iteration, with DC suppression, perform similarly in terms of convergence behavior, measured against the number of projections. However, the generalized Landweber iteration can be computed in parallel, while ART, a ray-by-ray iteration [15], cannot. In addition, the generalized Landweber algorithm allows considerable control over its convergence behavior, unlike ART.

Also note that the acceleration of STP with DC suppression was directly related to the large gain factors $\alpha_k$ in the first several iterations, caused by DC suppression. Fig. 10 shows this phenomenon: STP with DC suppression had larger gain factors than STP iteration without it for the first 10 forward and backward projections for the Shepp-Logan phantom, and for the first 12 for the complex phantom.

VI. SUMMARY

A new DC-suppression procedure has been developed, and shown to significantly accelerate convergence of the Landweber-type iterations. The procedure identifies and removes the DC component from the iteration, permitting larger gains $\alpha_k$ and accelerating convergence.

The  generalized  Landweber  iteration  with  DC  suppression 
was  comparable  to  ART  with  DC  suppression.  However, 
considering  the  feasibility  of  parallel  computing  of  the  forward 
and  the  backward  projections,  and  the  controllability  of  the 
generalized  Landweber  iteration,  the  generalized  Landweber 
iteration  may  be  favorably  compared  to  ART. 

Topics for further research include further study of the ratio $r_\sigma$ for different system matrices, and investigation of the possibility of using different shaping matrices to speed up the generalized Landweber iteration with DC suppression.

Abstract

We develop a new procedure that speeds up the convergence during the initial stage (the first 100 forward and backward projections) of Landweber-type algorithms for iterative image reconstruction, which include the Landweber, generalized Landweber, and steepest descent algorithms. The procedure first identifies the singular vector associated with the maximum singular value of the PET system matrix, and then suppresses projection of the data on this singular vector after a single Landweber iteration. We show that typical PET system matrices have a significant gap between their two largest singular values; hence this suppression allows larger gains in subsequent iterations, speeding up convergence by roughly a factor of three. New contributions of this paper include: 1) study of the singular value spectra of typical PET system matrices; 2) study of the effect on convergence of projection on the maximum singular vector; and 3) study of the convergence behavior of the new procedure applied to the Landweber, generalized Landweber, steepest descent, conjugate gradient, and ART algorithms (comparison is also made with the MLEM algorithm).

I.  Introduction 

Some of the most researched algorithms in iterative image reconstruction in emission tomography are maximum-likelihood EM (MLEM) [1], algebraic reconstruction technique (ART) [2], steepest descent (STP) [3], and conjugate gradient (CG) [3]. One characteristic of these algorithms is that they are all based on optimization strategies, which make the reconstruction dependent on the object, so that no formula can describe what has been achieved after a given number of iterations.

ART modifies the reconstructed image by projecting from one hyperplane to another hyperplane defined by a system of linear equations; MLEM maximizes a likelihood function; and STP and CG search adaptively for the largest gradient defined by the least-squares error between the projection data and the estimated projection data. In each case, the precise meaning of the image following each iteration is unclear.

*The authors wish to thank Dr. W. Leslie Rogers and Mr. Neal H. Clinthorne of the Division of Nuclear Medicine of the University of Michigan for many helpful discussions. The work of the first author was supported in part by NIH grant #P01-CA42768, and in part by a Research Partnership award from the H. H. Rackham School of Graduate Studies of the University of Michigan. The work of the second author was supported in part by ONR grant #N00014-90-J-l

One alternative is to use the generalized Landweber iteration [4], a method which can guarantee a characterized property of the reconstructed image after a number of iterations. Using this iteration [5], the reconstruction process is treated as a filtering in the singular space (defined by the singular values and singular vectors of the system matrix) [4]; no matter what the object is, the reconstructed image possesses a predefined characteristic after each single iteration, which can be designed before the iteration starts. Moreover, with a proper shaping matrix, the iteration can accelerate and regularize the image reconstruction process. Since the iteration is system dependent, not object dependent, computation time can be estimated accurately for various objects.

We applied the procedure to the image reconstruction of a positron emission tomography (PET) system. The procedure first identifies the singular vector associated with the maximum singular value of the PET system matrix, and then suppresses the projection of the image on this singular vector after a single Landweber iteration. The projection is defined as the DC-component of the image. We show that typical PET system matrices have a significant gap between their two largest singular values (this gap is inherent in most matrices with non-negative coefficients [6]); hence this suppression allows larger gain factors in the subsequent iterations, speeding up convergence by roughly a factor of three.

This paper is organized as follows. Section II formulates the image reconstruction problem as the solution to a large system of linear equations, and reviews the Landweber-type iterations. Section III introduces the new DC-suppression procedure. Section IV presents and summarizes numerical results. Comparisons with the MLEM, ART and CG iterations (both with and without applying the DC-suppression procedure) are also included. Section V concludes with a summary.

II. Background

A.  System  Equation 

The system equation that describes the transformation or projection process in a PET system is usually represented as a linear system $Ax = b$, where $A$ is an $m \times n$ system matrix, which describes the system geometry, $x$ is an $n \times 1$ vector of the image pixels, and $b$ is an $m \times 1$ vector of the measured projection of the image. There are several problems associated with solving the system: 1) typically $m$ and $n$ are on the order of several thousand, which makes a direct (as opposed to iterative) approach to solving it very difficult; 2) $A$ has a large condition number, so a small perturbation in $b$ causes a significant change in the solution $x^* = A^{\dagger} b$, where $A^{\dagger}$ is the pseudoinverse of $A$; and 3) $A$ is very sparse (about 2-3% of its entries are nonzero) [7].

B. Landweber-Type Algorithms

We define a Landweber-type iteration as an iteration of the form

$$x^{k+1} = x^k + \alpha_k D A^T (b - A x^k), \qquad (1)$$

where $\alpha_k$ is a scalar and $x^k$ is the reconstructed image after the $k$-th iteration. The iteration becomes:

1) the Landweber iteration [8] when $\alpha_k = \alpha$ (a constant) and $D = I$ (the identity matrix);

2) the generalized Landweber iteration [4] when $\alpha_k = \alpha$ (a constant) and $D$ is the shaping matrix, a polynomial function of $\alpha A^T A$. $D$ is assumed to be $F(\lambda = \alpha A^T A)$ [4] in this research:

$$F(\lambda) = 31.5 - 315\lambda + 1443.75\lambda^2 - 3465\lambda^3 + 4504.5\lambda^4 - 3003\lambda^5 + 804.375\lambda^6; \qquad (2)$$

3) the STP iteration [3] when $D = I$ and $\alpha_k$ changes according to the largest gradient defined by the least-squares error between the projection data and the estimated projection data:

$$x^{k+1} = x^k + \alpha_k q_k. \qquad (3)$$
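Cases 1) and 3) of the list above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: comparing (3) with (1) for $D = I$, the direction is taken to be $q_k = A^T(b - A x^k)$, and the STP gain is taken to be the standard exact line-search value $\alpha_k = \|q_k\|^2 / \|A q_k\|^2$ for the least-squares error, since the source's own formula for $\alpha_k$ is not legible here. The function names and the toy system are hypothetical.

```python
def matvec(A, x):
    """Forward projection A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def rmatvec(A, y):
    """Backward projection A^T y."""
    return [sum(A[t][i] * y[t] for t in range(len(A))) for i in range(len(A[0]))]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def gradient(A, b, x):
    """q = A^T (b - A x): one forward and one backward projection."""
    return rmatvec(A, [bi - pi for bi, pi in zip(b, matvec(A, x))])

def landweber(A, b, alpha, iterations):
    """Iteration (1) with constant gain alpha and D = I, started from zero."""
    x = [0.0] * len(A[0])
    for _ in range(iterations):
        q = gradient(A, b, x)
        x = [xi + alpha * qi for xi, qi in zip(x, q)]
    return x

def steepest_descent(A, b, iterations):
    """Iteration (3) with an adaptive gain (assumed exact line search)."""
    x = [0.0] * len(A[0])
    gains = []
    for _ in range(iterations):
        q = gradient(A, b, x)
        Aq = matvec(A, q)          # the second forward projection of an STP step
        denom = dot(Aq, Aq)
        if denom == 0.0:           # gradient vanished: already converged
            break
        alpha_k = dot(q, q) / denom
        gains.append(alpha_k)
        x = [xi + alpha_k * qi for xi, qi in zip(x, q)]
    return x, gains

# Toy system with singular values 3 and 1 (illustration only).
A = [[2.0, 1.0], [1.0, 2.0]]
b = [5.0, 7.0]                     # A applied to the image [1, 3]
x_lw = landweber(A, b, 1.0 / 9.0, 400)
x_sd, gains = steepest_descent(A, b, 200)
```

In this form each STP step uses two forward projections and one backward projection, whereas a Landweber step uses one of each. The adaptive gains always lie between $1/\sigma_1^2$ and $1/\sigma_2^2$ for this toy system, which is why removing the component on the largest singular vector leaves room for larger gains.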

For simplicity, we refer to both $\alpha$ and $\alpha_k$ as gain factors. When initialized with zero, all three iterations converge to $x^*$ provided the Euclidean norm $\|\alpha D A^T A\|_2 < 2$ [4] for the (generalized) Landweber iteration, whose $\alpha$ is normally set

III. The DC-suppression procedure

It was observed [5] that 1) the DC-component of the image could be recovered after one single Landweber iteration when $\alpha = 1/\sigma_1^2$; 2) the difference in magnitude between the two largest singular values $\sigma_1$ and $\sigma_2$ of a PET system matrix is large. Fig. 1 shows one example of the distribution of the singular values of the synthetic PET system matrix $A_1$ in [9], where $\sigma_1$ is scaled to 1 and the other $\sigma$'s are scaled accordingly. In this example, $\sigma_1$ is much larger than $\sigma_2$ ($\sigma_1 = 1.0433$, $\sigma_2 = 0.61995$, the ratio $r_\sigma = \sigma_1/\sigma_2 = 1.683$); and 3) the speed of the Landweber-type iteration is limited if the DC-component stays in the iteration. Therefore it is reasonable that after applying one single Landweber iteration and removing the DC-component from the iteration, the Landweber-type iteration can be accelerated by increasing $\alpha$ to $1/\sigma_2^2$ (a larger gain). The DC-component will be added back after the iteration is over. It was shown [5] that $\sigma_1$ and $\sigma_2$ and their associated singular vectors $v_1$ and $v_2$ can be derived using the power method and a purification procedure [10] without much effort, and this computation can be carried out during the data acquisition process. Fig. 2 shows the flowchart of the DC-suppression procedure. Note that $x_{dc}$ denotes the DC-component.
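One possible reading of this procedure is sketched below for the plain Landweber iteration. Since the flowchart of Fig. 2 is not reproduced in this text, the ordering of steps is an assumption: a single Landweber iteration with gain $1/\sigma_1^2$, extraction of the component on $v_1$ as the DC-component, removal of its contribution from the data, iteration with the larger gain $1/\sigma_2^2$ while suppressing any projection on $v_1$, and finally adding the DC-component back. The function names and the toy system are hypothetical.

```python
import math

def matvec(A, x):
    """Forward projection A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def rmatvec(A, y):
    """Backward projection A^T y."""
    return [sum(A[t][i] * y[t] for t in range(len(A))) for i in range(len(A[0]))]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def landweber(A, b, alpha, iterations):
    """Plain Landweber iteration from zero, for comparison."""
    x = [0.0] * len(A[0])
    for _ in range(iterations):
        q = rmatvec(A, [bi - pi for bi, pi in zip(b, matvec(A, x))])
        x = [xi + alpha * qi for xi, qi in zip(x, q)]
    return x

def landweber_dc(A, b, v1, sigma1, sigma2, iterations):
    """Landweber iteration with DC suppression (assumed step ordering)."""
    # Single Landweber iteration from zero with gain 1/sigma1^2.
    x1 = [g / sigma1 ** 2 for g in rmatvec(A, b)]
    # Its projection on v1 is taken as the DC-component.
    c = dot(x1, v1)
    x_dc = [c * vi for vi in v1]
    # Remove the DC contribution from the data.
    b_res = [bi - pi for bi, pi in zip(b, matvec(A, x_dc))]
    # Iterate with the larger gain, suppressing any projection on v1.
    alpha = 1.0 / sigma2 ** 2
    x = [0.0] * len(v1)
    for _ in range(iterations):
        q = rmatvec(A, [bi - pi for bi, pi in zip(b_res, matvec(A, x))])
        leak = dot(q, v1)
        q = [qi - leak * vi for qi, vi in zip(q, v1)]
        x = [xi + alpha * qi for xi, qi in zip(x, q)]
    # Add the DC-component back.
    return [xi + di for xi, di in zip(x, x_dc)]

# Toy system: sigma1 = 3 with v1 the constant vector, sigma2 = 1.
A = [[2.0, 1.0], [1.0, 2.0]]
x_true = [1.0, 3.0]
b = matvec(A, x_true)
v1 = [1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)]
x_fast = landweber_dc(A, b, v1, 3.0, 1.0, 1)
x_slow = landweber(A, b, 1.0 / 9.0, 2)
err_fast = math.sqrt(sum((a - t) ** 2 for a, t in zip(x_fast, x_true)))
err_slow = math.sqrt(sum((a - t) ** 2 for a, t in zip(x_slow, x_true)))
```

The explicit suppression inside the loop matters because the larger gain exceeds the stability limit for the component on $v_1$: without it, rounding errors along $v_1$ would grow from one iteration to the next. The toy system has only two singular values, so it converges in a single suppressed iteration; a real system matrix would not.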


Figure 1: An example of the singular value histogram of a PET system (horizontal axis: normalized singular value). Note that the singular values have been normalized, i.e., divided by the largest singular value. There are 21 singular values very close to zero.

To show how the DC-suppression procedure accelerates convergence, let $r_\sigma = \sigma_1/\sigma_2 = 1.68$, which approximates the value $r_\sigma$ of the system matrix $A_3$ [9] whose dimension is 4160 x 3228, $\sigma_1 = 0.986703$, and $\sigma_2 = 0.587256$. Define multiplication of $A$ by an image vector (of dimension $n$) as a forward projection, and multiplication of $A^T$ by a projection data vector (of dimension $m$) as a backward projection. The derivation of $\sigma_1$ and $\sigma_2$ took 9 and 25 forward and backward projections, respectively. Most computation in iterative reconstruction algorithms consists of these two operations.

The computation for DC suppression consists of one forward and backward projection (see the first four blocks in Fig. 2). A single Landweber iteration also requires one forward and backward projection. The generalized Landweber iteration, with $D = F(\alpha A^T A)$ as in (2), requires 7 forward and backward projections per iteration [5].

Figure 2: The flowchart of the DC-suppression procedure.

Fig. 3 shows the number of forward and backward projections needed to recover 95% of image components on various singular vectors when $r_\sigma = 1.68$ for the Landweber and generalized Landweber iterations both with and without DC-suppression. Without loss of generality, the largest singular value is assumed to be 1. Fig. 3 shows that after about 22 forward and backward projections (1 from the DC suppression and 21 from 3 generalized Landweber iterations): 1) the generalized Landweber iteration recovers more components than the Landweber iteration; 2) the (generalized) Landweber iteration with DC suppression recovers more high-frequency components than the (generalized) Landweber iteration alone; and 3) the generalized Landweber iteration with DC suppression recovers all image components on singular vectors with singular values greater than 0.1; to achieve this without DC suppression, 63 forward and backward projections are necessary. Hence the Landweber and generalized Landweber iterations without DC suppression require roughly three times as many projections to reach the same convergence point as the iterations with DC suppression. This factor-of-three speed-up, together with the possibility of parallel implementation, suggests greater application of these algorithms in the future.

Figure 3: The number of forward and backward projections needed for the recovery of 95% of components on various singular vectors (horizontal axis: normalized singular value). Without loss of generality, the largest singular value is assumed to be 1. The symbol DC after Landweber means Landweber with DC suppression; the G-Landweber means generalized Landweber. Note that $r_\sigma = \sigma_1/\sigma_2 = 1.68$.


IV.  Simulation  results 

The Landweber-type algorithms, ART and CG, all with and without the DC-suppression procedure, were compared in the reconstruction of the image of the complex phantom (see Fig. 4) in the PET system with system matrix $A_3$ [5]. The ART and MLEM iterations, both with average initial condition¹ (each initial image pixel is set to be the total value of $b$ divided by the total number of image pixels), were also included for comparison.

Fig. 5 shows the results. Each symbol in a curve represents the result after a single iteration. The comparison was based on how close the solution was to the true image versus the number of forward and backward projections spent. It is clear that the Landweber-type algorithms were all accelerated using the DC-suppression procedure (see Fig. 5(a) and (c)). Fig. 6 shows that the acceleration of STP with the DC-suppression procedure was directly related to the larger gains caused by the DC suppression during the first 12 forward and backward projections. ART with the DC-suppression procedure was faster than ART without the DC-suppression but was comparable to ART with average initial condition (see Fig. 5(b)). This indicates that ART can be significantly accelerated by simply using the average initial condition for the reconstruction of the complex phantom whose values are non-negative. In this example, ART can be accelerated about three times by simply using average initial condition. CG could not be accelerated (see Fig. 5(c)). The generalized Landweber iteration was faster than CG and was comparable to ART, all with the DC-suppression procedure (see Fig. 5(d)).

¹A common initial condition setting in emission tomography.

Figure 4: The image of the complex phantom.

Figure 5: The performance of reconstructing the image of the complex phantom using (a) the Landweber iteration, the generalized Landweber iteration, and their DC suppression implementations; (b) ART iterations with zero, average, and DC suppression implementation; (c) STP iteration, CG iteration, and their DC suppression implementations; and (d) the fastest methods in the previous three figures (one from each figure) and MLEM iteration.


V.  Summary 

A new DC-suppression procedure has been developed and shown to significantly accelerate the convergence of the Landweber-type iterations. The procedure identifies and removes the DC component from the iteration, permitting larger gains $\alpha_k$ and accelerating convergence. The convergence speed-up is due to the large gap between the two largest singular values of the system matrix; this gap was observed in PET system matrices of different scales, and is likely to be present in any problem with a non-negative system matrix.

Figure 6: The change of the gain factors in STP iteration after DC suppression.

The generalized Landweber iteration with DC suppression recovered 95% of the image components on singular vectors with singular values greater than 0.1 in about 22 forward and backward projections when the ratio $r_\sigma = 1.68$; without DC suppression 63 forward and backward projections were necessary. Although ART with DC suppression showed faster convergence than ART with a zero initial condition, this was not a significant improvement over ART with an average initial condition. With DC suppression, generalized Landweber and ART converge faster than CG and MLEM (DC suppression cannot be used on MLEM) during the initial stage of the iteration (the first 100 forward and backward projections). The generalized Landweber iteration with DC suppression was comparable to ART with DC suppression. However, considering the feasibility of parallel computing of the forward and the backward projections, and the controllability of the generalized Landweber iteration, the generalized Landweber iteration may compare favorably with ART.

Topics for further research include further study of the ratio $r_\sigma$ for different system matrices, and investigation of the possibility of using different shaping matrices to speed up the generalized Landweber iteration with DC suppression.

APPENDIX III

T.-S.  Pan  and  A.E.  Yagle,  “Numerical  Study  of  Multigrid  Implementations 
of  Some  Iterative  Image  Reconstruction  Algorithms,”  IEEE  Trans.  Medical 
Imaging  10(4),  572-588,  December  1991. 

This paper investigates the use of several iterative algorithms, including the generalized Landweber iteration, in multigrid image reconstruction. The image is first reconstructed quickly on a coarse grid. This coarse image is then used as the initialization for reconstruction of the image on a fine grid. Many numerical examples are used to illustrate the performance of various algorithms.

Abstract — The numerical behavior of multigrid implementations of the Landweber, generalized Landweber, ART, and MLEM iterative image reconstruction algorithms is investigated. Comparisons between these algorithms, and with their single-grid implementations, are made on two small-scale synthetic PET systems, for phantom objects exhibiting different characteristics, and on one full-scale synthetic system, for a Shepp-Logan phantom. We also show analytically the effects of noise and initial condition on the generalized Landweber iteration, and note how to choose the shaping operator to filter out noise in the data, or to enhance features of interest in the reconstructed image. Original contributions include 1) numerical studies of the convergence rates of single-grid and multigrid implementations of the Landweber, generalized Landweber, ART, and MLEM iterations and 2) effects of noise and initial condition on the generalized Landweber iteration, with procedures for filtering out noise or enhancing image features.


I.  Introduction 

Iterative image reconstruction algorithms have been studied intensively in the last decade, for application in single photon emission computerized tomography (SPECT) and positron emission tomography (PET) [1]-[9]. Some believe [10], [11] that an iterative reconstruction algorithm produces a less noisy image than filtered back-projection (FBP) [12]. Some advantages of using an iterative algorithm are 1) iterative algorithms can still be applied when complete projection data are unavailable, 2) there is considerable control over the reconstruction process, and 3) spatially varying attenuation corrections can be incorporated [10], [11]. Some possible drawbacks of iterative algorithms are 1) a large amount of computation and time required; and 2) the need to incorporate regularization [13]-[15] to make the iteration numerically stable and insensitive to noise.

Most iterative algorithms exhibit a nonuniform convergence property [3], [15]: low-frequency components of the image tend to be recovered earlier in the iteration than high-frequency components. Here high-frequency components are those associated with small singular values, and low-frequency components are associated with large singular values of the system matrix. This terminology is common (e.g., [15]), although these frequencies coincide with Fourier frequencies only in a special circumstance [19]. The convergence rate of the algorithm is thus limited by the relatively slow convergence of the high-frequency components of the image. Algorithms such as the Landweber iteration [16], algebraic reconstruction technique (ART) [2], and maximum-likelihood-expectation-maximization (MLEM) [5], allow little control over their convergence behavior.

The generalized Landweber iteration of Strand [17], in contrast, permits considerable control over its convergence rate by the choice of the shaping operator. Although it has not been generally used in PET imaging, this algorithm has several advantages: 1) it is parallelizable (unlike ART), permitting fast implementations, e.g., a forward or backward projection can be computed by multiple processors at the same time, 2) its convergence behavior can be specified using the shaping operator, and 3) the shaping operator can be used to filter out noise in the data or to enhance features in the image. These advantages will be discussed in Section III, along with details on the effects of noise and initial condition on the convergence rate of the iteration. In particular, we treat the data and noise separately, and reveal an important distinction between the effects of data and noise on the reconstructed image as the iteration progresses.

Initialization of an iterative algorithm has an important effect on its convergence. Initialization can be as simple as setting all pixels to zero, or as complex as using the result of FBP [1], [14]. The presumption behind the latter approach is that FBP should furnish a starting point that is “close” to the desirable image, after which the iteration would quickly converge. Surprisingly, this approach has shown little success [14]. In Section V, our numerical results lead to a possible explanation of why this is so. We also examine the effects of different initial conditions on the convergence behavior of different iterative algorithms.

The most important new contribution of this paper is a numerical study of multigrid implementations of various iterative algorithms: Landweber, generalized Landweber, ART, and MLEM. In a multigrid implementation, the iteration is first used on a coarse grid until it converges; the result is then interpolated and used as an initial condition on a fine grid. The coarse grid iteration requires much less computation per iteration. Although two different system geometry matrices are needed (one for each grid), both matrices are relatively sparse, so the additional storage required for the (much smaller) coarse-grid system matrix is minor. More details are given in Section IV.

We  define  a  local  smoothness  property:  the  values  of 
four  neighboring  fine-grid  pixels  to  be  grouped  as  a 
coarse-grid  pixel  are  close  to  each  other.  Our  results  in¬ 
dicate  that  if  the  image  has  this  property,  then  the  con¬ 
vergence  of  high-frequency  components  of  the  image  can 
be significantly accelerated using a multigrid implementation.
"Local smoothness" is not strictly defined here, nor does it
need to be: the more this property holds, the
greater the acceleration of convergence. If the local
smoothness  property  does  not  hold,  or  if  there  are  no  high- 
frequency  components  in  the  image,  then  a  multigrid  im¬ 
plementation  does  not  seem  to  speed  up  the  convergence 
rate.  Ranganath  et  al.  [22]  proposed  a  multigrid  imple¬ 
mentation  of  the  MLEM  algorithm,  and  gave  a  numerical 
example.  In  the  present  paper,  results  for  several  different 
algorithms  are  given,  more  examples  are  given,  and  more 
conclusions  about  reconstruction  behavior  are  made. 

Our  numerical  experiments  were  mainly  conducted  on 
two  small-scale  synthetic  PET  systems  (see  Section  V  for 
details).  The  systems  are  large  enough  to  exhibit  behavior 
that  would  be  seen  in  real  PET  systems,  but  small  enough 
to  study  in  detail  aspects  of  the  image  and  the  system 
(e.g., singular-values) that cause various types of convergence
behavior. A numerical example of a large-scale
synthetic PET system with a Shepp-Logan phantom [12]
is also included. Many numerical experiments were performed;
the examples given here are intended to be illustrative,
not comprehensive, and our conclusions are not
based  just  on  the  examples  presented,  but  on  many  others 
as  well. 

The  paper  is  organized  as  follows.  Section  II  formu¬ 
lates  the  image  reconstruction  problem  as  the  solution  to 
a  large  system  of  linear  equations,  and  gives  a  review  of 
singular  value  decomposition  (SVD).  Section  III  studies 
analytically  and  in  detail  the  convergence  behavior  of  the 
Landweber  and  generalized  Landweber  iterations.  The  dif¬ 
ferent  roles  of  data,  noise,  and  initial  condition  on  the 
convergence  rate  are  studied  for  the  first  time,  and  pro¬ 
cedures  for  filtering  out  noise  and  enhancing  image  fea¬ 
tures  are  discussed.  Section  IV  summarizes  the  idea  of  a 
multigrid  implementation  and  presents  the  multigrid  im¬ 
plementation  adopted  in  this  work.  Section  V  presents, 
summarizes,  and  discusses  numerical  results,  and  pre¬ 
sents  some  conclusions  about  convergence  rates  of  mul¬ 
tigrid  implementations.  Section  VI  concludes  with  a  sum¬ 
mary. 


II. Problem Formulation
A.  System  Equation 

The system equation [2], [5] that describes the transformation or projection process in SPECT or PET is usually represented as

$$ Ax = b \tag{1} $$

where $A$ is an $m \times n$ system matrix, which describes the system geometry, $x$ is an $n \times 1$ vector of the image pixels, and $b$ is an $m \times 1$ vector of the measured projection of the image. The matrix $A$ is typically sparse [6].

There are several problems associated with the solution of (1).

1) The size of the system: typically $m$ and $n$ are on the order of thousands [5], [6].

2) Ill-conditioning of the system: typically $A$ has a large condition number [15], so that a small change in the projection data $b$ may cause a large change in the solution $x$ or in the minimum-norm least-squares solution $x^*$,

$$ x^* = A^{+} b \tag{2} $$

where $A^{+}$ is the pseudoinverse of $A$ [24].

3) Ill-posedness: typically, $A$ may have a nonempty null space, so that some components of the image cannot be recovered from the projection $b$ without additional information.


B. Review of Singular Value Decomposition

Let the singular value decomposition (SVD) [24] of the real $m \times n$ system matrix $A$ be

$$ A = U \Sigma V^T \tag{3} $$

where $U = [u_1, u_2, \ldots, u_m]$ and $V = [v_1, v_2, \ldots, v_n]$ are orthogonal matrices. The vectors $u_i$ and $v_i$ are singular-vectors, and $U$ and $V$ represent a set of orthonormal bases for the real Hilbert spaces $\mathbb{R}^m$ and $\mathbb{R}^n$, respectively. The matrix $\Sigma$ is $m \times n$ and "diagonal" in that $(\Sigma)_{i,j} = 0$ unless $i = j$, in which case $(\Sigma)_{i,i} = \sigma_i$, the singular-values of $A$. The number of nonzero singular-values is the rank $\rho(A)$ of $A$ [13], [17]. Without loss of generality, assume the maximum singular-value of $A$ is one (this can always be done by scaling $A$) and let $1 = \sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_{\rho(A)} > 0 = \sigma_{\rho(A)+1} = \cdots = \sigma_{\min\{m,n\}}$, where $\min\{m, n\}$ is the minimum of $m$ and $n$. Also define $\sigma_{\min\{m,n\}+1} = \cdots = \sigma_{\max\{m,n\}} = 0$, where $\max\{m, n\}$ is the maximum of $m$ and $n$.

It can be shown [13], [17] that

$$ (A A^T) u_i = \sigma_i^2 u_i, \quad i = 1, \ldots, m, \tag{4} $$

$$ (A^T A) v_i = \sigma_i^2 v_i, \quad i = 1, \ldots, n, \tag{5} $$

$$ A^T u_i = \sigma_i v_i, \quad i = 1, \ldots, m, \tag{6} $$

$$ A v_i = \sigma_i u_i, \quad i = 1, \ldots, n, \tag{7} $$

so that the squares $\sigma_i^2$ of the singular-values are the eigenvalues of the matrices $A A^T$ and $A^T A$. It can also be shown [24] that the minimum-norm least-squares solution to (1) is

$$ x^* = V \, \mathrm{diag}\left[ \frac{1}{\sigma_1}, \ldots, \frac{1}{\sigma_{\rho(A)}}, 0, \ldots, 0 \right] U^T b = \sum_{i=1}^{\rho(A)} \frac{1}{\sigma_i} (b, u_i) v_i \tag{8} $$



where $(b, u_i) = b^T u_i$, the inner product of vectors $b$ and $u_i$.

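Let $R(A)$ and $N(A)$ be the range and the null space of $A$, respectively, and let $R(A^T)$ and $N(A^T)$ be the range and the null space of $A^T$, respectively. It is also true [25] that

$$ R(A) = \mathrm{span}\{u_1, \ldots, u_{\rho(A)}\}, \tag{9} $$

$$ N(A^T) = \mathrm{span}\{u_{\rho(A)+1}, \ldots, u_m\}, \tag{10} $$

$$ R(A^T) = \mathrm{span}\{v_1, \ldots, v_{\rho(A)}\}, \tag{11} $$

$$ N(A) = \mathrm{span}\{v_{\rho(A)+1}, \ldots, v_n\}, \tag{12} $$

$$ \mathbb{R}^m = R(A) \oplus N(A^T), \quad \mathbb{R}^n = R(A^T) \oplus N(A) \tag{13} $$

where $\oplus$ means direct sum. In fact, $N(A^T) = R(A)^\perp$ and $R(A^T) = N(A)^\perp$, where $R(A)^\perp$ and $N(A)^\perp$ are the orthogonal complements of $R(A)$ and $N(A)$, respectively.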
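
The SVD relations above can be checked numerically on a toy problem. The following Python sketch is only an illustration: the 3 x 2 matrix is built from a hand-chosen SVD, and the rotation angles, singular-values, and test vectors are assumptions made for the example, not values from the paper. It verifies (6) and (7), confirms that the third left singular-vector spans the null space of the transpose as in (10), and evaluates the sum in (8) for data that contain a component outside the range of the matrix.

```python
import math

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def matvec(M, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in M]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

# Hypothetical example: choose U, Sigma, V first, then form A = U Sigma V^T.
cu, su = math.cos(0.3), math.sin(0.3)
cv, sv = math.cos(1.1), math.sin(1.1)
U = [[cu, -su, 0.0], [su, cu, 0.0], [0.0, 0.0, 1.0]]  # 3 x 3 orthogonal
V = [[cv, -sv], [sv, cv]]                             # 2 x 2 orthogonal
sigma = [1.0, 0.5]                                    # rank 2, largest value 1
Sigma = [[1.0, 0.0], [0.0, 0.5], [0.0, 0.0]]          # 3 x 2 "diagonal"
A = matmul(matmul(U, Sigma), transpose(V))
At = transpose(A)
u = transpose(U)  # u[i] is the i-th left singular-vector
v = transpose(V)  # v[i] is the i-th right singular-vector

# Largest residuals of (6) and (7) over the two nonzero singular-values.
res6 = max(abs(a - sigma[i] * c)
           for i in range(2) for a, c in zip(matvec(At, u[i]), v[i]))
res7 = max(abs(a - sigma[i] * c)
           for i in range(2) for a, c in zip(matvec(A, v[i]), u[i]))
# The remaining left singular-vector lies in N(A^T), as in (10).
res10 = max(abs(a) for a in matvec(At, u[2]))

# Sum in (8), using data with an extra component along u[2] (outside R(A)).
x_true = [2.0, -1.0]
b = [bi + 0.7 * ui for bi, ui in zip(matvec(A, x_true), u[2])]
x_star = [sum(dot(b, u[i]) / sigma[i] * v[i][j] for i in range(2))
          for j in range(2)]
print(res6, res7, res10, x_star)
```

Because this toy matrix has full column rank, the sum in (8) returns `x_true` even though `b` is not in the range of `A`; the out-of-range component is simply ignored.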

III. Nonuniform Convergence of the Landweber and Generalized Landweber Iterations

In  this  section,  we  review  the  Landweber  and  general¬ 
ized  Landweber  iterations,  and  the  nonuniform  conver¬ 
gence  behavior  of  these  algorithms;  low-frequency  com¬ 
ponents  of  the  image  are  recovered  earlier  in  the  iteration 
than  high  frequency  components.  We  also  make  some  new 
points  about  the  effects  of  noise  and  initial  condition  on 
the  convergence  behavior  of  these  algorithms.  These 
points  will  be  important  in  interpreting  the  numerical  re¬ 
sults  in  Section  V. 

A.  Review  of  the  Landweber  and  Generalized 
Landweber  Iterations 

The Landweber iteration method [16] was proposed in 1951. This iteration is

$$ x^k = x^{k-1} + A^T (b - A x^{k-1}) \tag{14} $$

where $x^k$, in our context, denotes the reconstructed image after the $k$th iteration, and the superscript $T$ means transpose. This iteration will converge to the minimum-norm least-squares solution $x^*$ if $\|A^T A\|_2 < 2$ and the iteration is initialized with $x^0 = 0$ [17], [18]. A formula showing the performance after each iteration will be derived for this iteration in Section III-B.
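
A minimal Python sketch of the iteration in (14) is given below, using plain lists rather than a sparse-matrix library. The function name `landweber` and the 2 x 2 test system (singular-values 1 and 0.5, exact solution [1, 2]) are assumptions chosen for illustration.

```python
def landweber(A, b, num_iters, x0=None):
    """Run x^k = x^{k-1} + A^T (b - A x^{k-1}) for num_iters iterations."""
    m, n = len(A), len(A[0])
    x = [0.0] * n if x0 is None else list(x0)
    for _ in range(num_iters):
        # Forward projection A x and residual b - A x.
        residual = [b[i] - sum(A[i][j] * x[j] for j in range(n))
                    for i in range(m)]
        # Backward projection A^T residual, added to the current image.
        x = [x[j] + sum(A[i][j] * residual[i] for i in range(m))
             for j in range(n)]
    return x

# Hypothetical system with singular-values 1 and 0.5; exact solution is [1, 2].
A = [[1.0, 0.0], [0.0, 0.5]]
b = [1.0, 1.0]
x_5 = landweber(A, b, 5)
x_50 = landweber(A, b, 50)
print(x_5, x_50)
```

In this example the component with singular-value 1 is recovered after a single iteration, while the component with singular-value 0.5 approaches its limit only geometrically, with ratio 0.75 per iteration.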

The Landweber iteration method was extended by Strand [17] to the generalized Landweber iteration method, which uses

$$ x^k = x^{k-1} + D A^T (b - A x^{k-1}) \tag{15} $$

for the iteration. The matrix $D$, called the shaping matrix, is a linear operator and can be designed as a polynomial function of $A^T A$ [17] to emphasize some frequency components of an image and to accelerate the convergence of these components. In this paper, we also assume the matrix $D$ is a polynomial function of $A^T A$. If $D = I$, then the generalized Landweber iteration in (15) reduces to the Landweber iteration in (14). The convergence of the iteration in (15) is assured when $\|D A^T A\|_2 < 2$ [17]. It will be shown in Section III-C that use of a proper $D$ matrix [17] results in the generalized Landweber iteration converging faster than the Landweber iteration. Note that the generalized Landweber iteration is similar to the Jansson-Van Cittert (JVC) iteration [20], [21]. Differences between the algorithms are that the generalized Landweber iteration involves matrices, and $D$ is restricted to be a linear operator. We do not consider nonlinear $D$ operators in this paper.

The major advantage in using the generalized Landweber
iteration in image reconstruction is that matrix $D$
behaves  as  a  filtering  operator,  so  that  the  generalized 
Landweber  iteration  can  selectively  reconstruct  or  empha¬ 
size  some  frequency  band  of  interest  in  an  image,  while 
attenuating  other  frequencies  in  the  same  image.  The  ma¬ 
jor  disadvantage  is  the  extra  multiplication  introduced  by 
D,  which  depends  proportionally  on  the  degree  of  the 
polynomial  function  being  used.  However,  the  general¬ 
ized  Landweber  iteration  is  parallelizable,  unlike  ART; 
this  helps  compensate  for  the  extra  computation. 

B.  Analysis  of  Convergence  of  the  Landweber  Iteration 

From (1), any component of the image $x$ in the null space $N(A)$ has no contribution to the projection $b$. In the (generalized) Landweber iteration with zero initial condition [see (14) and (15), and see (19) and (27) later], no component of the image $x$ in $N(A)$ can be recovered through the iteration, which is dependent on $b$, and any component of the projection $b$ not in $R(A)$ has no effect in the iteration.

Now suppose the projection data are noisy, so that

$$ b = b_0 + \epsilon \tag{16} $$

where $b_0 \in R(A)$ is the actual value of the projection data, and the noise $\epsilon$ models measurement noise and Poisson counting noise. The noise $\epsilon$ and initial condition $x_0$ can be further decomposed as

$$ \epsilon = \epsilon_0 + \epsilon_0^\perp, \tag{17} $$

$$ x_0 = \bar{x}_0 + x_0^\perp \tag{18} $$

where $\epsilon_0 \in R(A)$, $\epsilon_0^\perp \in R(A)^\perp$, $\bar{x}_0 \in R(A^T)$, and $x_0^\perp \in R(A^T)^\perp = N(A)$. Notice that in (17) only $\epsilon_0$ can affect the measurement of $b_0$, and only $\epsilon_0$ may have an adverse effect in the iterations (14) and (15), since $A^T \epsilon_0^\perp = 0$.

We now make some new observations about the effects of noise and initial condition on the convergence behavior. By expressing the reconstructed image $x^k$ in (14) as a function of the projection $b$ and initial condition $x_0 = \bar{x}_0 + x_0^\perp$ as in (18), and using (3), we can show

$$ x^k = x_0^\perp + \sum_{i=1}^{\rho(A)} [1 - (1 - \sigma_i^2)^k] \frac{1}{\sigma_i} (b_0, u_i) v_i + \sum_{i=1}^{\rho(A)} [1 - (1 - \sigma_i^2)^k] \frac{1}{\sigma_i} (\epsilon_0, u_i) v_i + \sum_{i=1}^{\rho(A)} (1 - \sigma_i^2)^k (\bar{x}_0, v_i) v_i. \tag{19} $$

Equation (19) plays an important role in exploring how the singular-values $\sigma_i$ and the number of iterations $k$ affect the relationship between the reconstructed image $x^k$ and the projection $b_0$, the noise $\epsilon_0$, and the initial condition $x_0$.

In (19) let $k \to \infty$, and in (8) let $b = b_0 + \epsilon$ as in (16). Comparing these, the only difference is $x_0^\perp$, the component of the initial condition $x_0$ in the null space $N(A)$. Define

$$ G_d(\sigma_i, k) \triangleq [1 - (1 - \sigma_i^2)^k], \tag{20} $$

$$ G_n(\sigma_i, k) \triangleq [1 - (1 - \sigma_i^2)^k] \frac{1}{\sigma_i}, \tag{21} $$

$$ G_{x_0}(\sigma_i, k) \triangleq (1 - \sigma_i^2)^k. \tag{22} $$

Then $G_d(\sigma_i, k)$, $G_n(\sigma_i, k)$, and $G_{x_0}(\sigma_i, k)$ represent the component gains of data, noise, and initial condition, respectively, after the first $k$ iterations, on the singular-vector $v_i$ associated with singular-value $\sigma_i$.
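
The three gains in (20)-(22) are simple scalar functions of the singular-value and the iteration count, so they are easy to tabulate. The short Python sketch below does this; the function names and the sample singular-values are illustrative choices, not part of the paper.

```python
def data_gain(sigma, k):
    """Data gain (20): fraction of a data component recovered after k iterations."""
    return 1.0 - (1.0 - sigma ** 2) ** k

def noise_gain(sigma, k):
    """Noise gain (21): the data gain multiplied by 1 / sigma."""
    return data_gain(sigma, k) / sigma

def initial_condition_gain(sigma, k):
    """Initial condition gain (22): what remains of the starting image."""
    return (1.0 - sigma ** 2) ** k

for sigma in (1.0, 0.5, 0.1):
    row = [round(data_gain(sigma, k), 4) for k in (1, 10, 30)]
    print(f"sigma = {sigma}: data gain at k = 1, 10, 30 -> {row}")
```

For any singular-value and iteration count, the data gain and the initial condition gain add up to one, which is another way of reading the first and last sums of (19).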

Note that the data gain $G_d(\sigma_i, k)$ does not include the factor $1/\sigma_i$, while the noise gain $G_n(\sigma_i, k)$ does. These gain definitions are made for the following reasons. The noise $\epsilon_0$, causing the reconstructed image $x^k$ in the Landweber iteration to differ from the noiseless minimum-norm least-squares solution of $Ax = b_0$, mainly comes from measurement or observation noise, not from Poisson counting noise. This noise does not arise in the projection process, which is low-pass in nature [26]; it is added afterwards, so that the factor $1/\sigma_i$ is lumped with $(\epsilon_0, u_i)$ in the noise gain $G_n(\sigma_i, k)$ (Poisson counting noise, which does come from the projection process, is assumed to have a limited effect on the reconstructed image $x^k$). On the other hand, since the projection data $b_0$ does come from the projection process, the data gain $G_d(\sigma_i, k)$ should not include $1/\sigma_i$ as we compare (8) and (19).

Equation (19) illustrates several points.

1) The projection $b_0$, noise $\epsilon_0$, and initial condition $x_0$ can be decomposed into components each having an independent contribution to and a different convergence rate in the resulting reconstruction.

2) The data gain $G_d(\sigma, k)$ has the property

$$ \lim_{k \to \infty} G_d(\sigma, k) = 1, \quad 0 < \sigma \le 1, \tag{23} $$

so that all components of the image not in $N(A)$ will eventually be recovered.

3) However, the noise gain $G_n(\sigma, k)$ will also become larger and larger as the iteration progresses (i.e., as the index $k$ increases):

$$ \lim_{k \to \infty} G_n(\sigma, k) = \frac{1}{\sigma}, \quad 0 < \sigma \le 1. \tag{24} $$

Comparing (23) and (24) and noting that $A$ usually has some very small singular-values, the noise will be greatly amplified by the time the image components in $R(A^T)$ are completely recovered since $1/\sigma_i \gg 1$ for small $\sigma_i$. This is due to the ill-conditioning of the system. A noise component on the singular-vector $v_i$ will be amplified by a factor proportional to $1/\sigma_i$.

4) The initial condition gain $G_{x_0}(\sigma, k)$ converges to zero

$$ \lim_{k \to \infty} G_{x_0}(\sigma, k) = 0, \quad 0 < \sigma \le 1 \tag{25} $$

so that the initial condition component $\bar{x}_0$ will become less and less important as the iteration progresses.

5) However, the vector $x_0^\perp$ produces a term independent of the iteration. Hence, a poor initial condition can impose a bias on the reconstruction which cannot be corrected by the iteration.
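
Point 5) can be seen on a deliberately tiny example. In the Python sketch below, the 1 x 2 system, the data value, and the initial image are all hypothetical; the second pixel lies entirely in the null space of the matrix, so whatever value the initial condition assigns to it is never changed by the Landweber iteration.

```python
def landweber_step(A, b, x):
    """One Landweber update x + A^T (b - A x)."""
    m, n = len(A), len(A[0])
    residual = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
    return [x[j] + sum(A[i][j] * residual[i] for i in range(m))
            for j in range(n)]

# Hypothetical 1 x 2 system: N(A) is spanned by the second pixel.
A = [[0.5, 0.0]]
b = [1.0]
x = [0.0, 3.0]  # initial condition with a null-space component of 3
for _ in range(200):
    x = landweber_step(A, b, x)
print(x)
```

The first pixel converges to the minimum-norm least-squares value 2, while the second pixel keeps the value 3 from the initial condition, which is the bias described in point 5).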

Fig. 1 shows the convergence properties of the data gain $G_d(\sigma, k)$, noise gain $G_n(\sigma, k)$, and initial condition gain $G_{x_0}(\sigma, k)$ for the first 30 Landweber iterations. In Fig. 1, the data components on singular-vectors with larger singular-values are recovered faster, and components on singular-vectors with smaller singular-values are recovered relatively slower. This demonstrates the nonuniform convergence property: the reconstruction of high-frequency components converges slowly, while the reconstruction of low-frequency components converges relatively fast. Here the "high" and "low" frequencies of an image are defined as the components on singular-vectors with "small" singular-values and the components on singular-vectors with "large" singular-values, respectively.

C.  Analysis  of  Convergence  of  the  Generalized 
Landweber  Iteration 

From the generalized Landweber iteration in (15), it can be seen that $D$ is an operator mapping $R(A^T)$ to $\mathbb{R}^n$. Since $R(A^T)$ is spanned by $\{v_1, \ldots, v_{\rho(A)}\}$ as in (11), the definition of $D$ will be complete if the image $D v_i$ of each singular-vector $v_i$ in $R(A^T)$ is specified [17]. Strand [17] proposed that a matrix $D$ could be designed by specifying the scalars $p_1, \ldots, p_{\rho(A)}$ in

$$ D v_i = p_i v_i, \quad 0 < p_i \sigma_i^2 < 2 \tag{26} $$

where the condition $0 < p_i \sigma_i^2 < 2$ assures the convergence of the generalized Landweber iteration. Repeating the procedure for deriving (19), the reconstructed image $x^k$ after the $k$th generalized Landweber iteration will be

$$ x^k = x_0^\perp + \sum_{i=1}^{\rho(A)} [1 - (1 - p_i \sigma_i^2)^k] \frac{1}{\sigma_i} (b_0, u_i) v_i + \sum_{i=1}^{\rho(A)} [1 - (1 - p_i \sigma_i^2)^k] \frac{1}{\sigma_i} (\epsilon_0, u_i) v_i + \sum_{i=1}^{\rho(A)} (1 - p_i \sigma_i^2)^k (\bar{x}_0, v_i) v_i. \tag{27} $$

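The first sum appeared in [17]; the additional sums representing the effects of noise and initial condition are new. Define [compare to (20)-(22)]

$$ G'_d(\sigma_i, k) \triangleq [1 - (1 - p_i \sigma_i^2)^k], \tag{28} $$

$$ G'_n(\sigma_i, k) \triangleq [1 - (1 - p_i \sigma_i^2)^k] \frac{1}{\sigma_i}, \tag{29} $$

$$ G'_{x_0}(\sigma_i, k) \triangleq (1 - p_i \sigma_i^2)^k. \tag{30} $$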

Fig. 1. The convergence properties of (a) data gain $G_d(\sigma, k)$, (b) noise gain $G_n(\sigma, k)$, and (c) initial condition gain $G_{x_0}(\sigma, k)$ versus singular-value $\sigma$ for the first 30 Landweber iterations. The first, second, and third iterations plus the last iteration are labeled.

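Then $G'_d(\sigma_i, k)$, $G'_n(\sigma_i, k)$, and $G'_{x_0}(\sigma_i, k)$ represent the component gains of data, noise, and initial condition, respectively, after the first $k$ iterations on the singular-vector $v_i$ with singular-value $\sigma_i$. The convergence properties of $G'_d(\sigma, k)$, $G'_n(\sigma, k)$, and $G'_{x_0}(\sigma, k)$ are [compare to (23)-(25)]

$$ \lim_{k \to \infty} G'_d(\sigma, k) = 1, \quad 0 < \sigma \le 1, \tag{31} $$

$$ \lim_{k \to \infty} G'_n(\sigma, k) = \frac{1}{\sigma}, \quad 0 < \sigma \le 1, \tag{32} $$

$$ \lim_{k \to \infty} G'_{x_0}(\sigma, k) = 0, \quad 0 < \sigma \le 1. \tag{33} $$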

From (23)-(25) and (31)-(33), we see that the asymptotic behavior of the Landweber iteration is identical to that of the generalized Landweber iteration. The only difference between (19) and (27) is that $\sigma_i^2$ in (19) has been changed to $p_i \sigma_i^2$ in (27). When $p_i = 1$, the iteration in (27) is identical to that in (19). We also see that choosing $p_i$ can modify the convergence rates of the data gain $G'_d(\sigma_i, k)$, noise gain $G'_n(\sigma_i, k)$, and initial condition gain $G'_{x_0}(\sigma_i, k)$. However, the convergence rates of $G'_d(\sigma_i, k)$, $G'_n(\sigma_i, k)$, and $G'_{x_0}(\sigma_i, k)$ will all be accelerated or be decelerated at the same time.

We now demonstrate the convergence behavior of the generalized Landweber iteration for two different choices of the shaping matrix $D$. First, let $D$ be specified by choosing $p_i = 1/\sigma_i^2$, $\sigma_i \neq 0$ in (27). Then the limiting behavior described in (31)-(33) is attained after a single iteration! Of course, (27) along with zero initial condition and $p_i = 1/\sigma_i^2$ make it clear that this choice of $D$ is equivalent to the direct computation of the minimum-norm least-squares solution $x^*$ in (8).

The second choice of $D$ is from [17], in which $D$ is given as a polynomial function $F(\cdot)$ of $A^T A$, i.e., $D = F(A^T A)$ where

$$ F(\lambda) = 31.5 - 315\lambda + 1443.75\lambda^2 - 3465\lambda^3 + 4504.5\lambda^4 - 3003\lambda^5 + 804.375\lambda^6. \tag{34} $$

The polynomial function $F(\lambda)$ is chosen so that $\lambda F(\lambda)$ is a good approximation to the unit step function. This choice results in [see (5) and (26)] $D v_i = F(A^T A) v_i = F(\sigma_i^2) v_i$ and $F(\sigma_i^2) = p_i$, so that (34) effectively chooses $p_i \sigma_i^2 \approx 1$ to achieve the minimum-norm least-squares solution faster [see (27)]. Hereafter, we will assume this polynomial function is used for the $D$ matrix in the generalized Landweber iteration. From this example, we can see that it is not necessary to carry out an SVD computation in order to design the shaping matrix $D$.
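
To see how closely the polynomial in (34) makes the product of the shaping scalar and the squared singular-value approach one, the Python sketch below evaluates that product for a few singular-values. The coefficients are those of (34); the function names and the list of sample singular-values are illustrative assumptions.

```python
# Coefficients of the polynomial in (34), lowest order first.
COEFFS = [31.5, -315.0, 1443.75, -3465.0, 4504.5, -3003.0, 804.375]

def shaping_polynomial(lam):
    """Evaluate the polynomial of (34) at lam by Horner's rule."""
    value = 0.0
    for c in reversed(COEFFS):
        value = value * lam + c
    return value

def effective_step(sigma):
    """Return sigma^2 * F(sigma^2), the per-iteration step on a component."""
    lam = sigma ** 2
    return lam * shaping_polynomial(lam)

for sigma in (1.0, 0.9, 0.7, 0.5, 0.3, 0.1, 0.01):
    print(f"sigma = {sigma:<5} step = {effective_step(sigma):.4f}")
```

For the sampled singular-values of 0.3 and above, the step stays within roughly 0.15 of one, while for very small singular-values it falls toward zero, so those components still converge slowly, only less slowly than with the plain Landweber iteration.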

Fig. 2 shows the convergence properties of the data gain $G'_d(\sigma, k)$, noise gain $G'_n(\sigma, k)$, and initial condition gain $G'_{x_0}(\sigma, k)$ for the first 30 generalized Landweber iterations using (34). Comparing the convergence patterns in Figs. 1 and 2, we see that using (34) in the generalized Landweber iteration accelerates the convergence rate (in terms of the number of iterations), and further attenuates the effect of initial condition, but at the same time further amplifies the noise.

D.  Filtering  with  the  Generalized  Landweber  Iteration 

Equations (27)-(30) show that the generalized Landweber iteration can filter either the image or the data, reducing the effects of noise or enhancing features in the image. To see this, suppose we know a priori that the noise $\epsilon_0$ lies (or is likely to lie) in some subspace spanned by $\{u_i, i \in S\}$ where $S \subset \{1, \ldots, \rho(A)\}$, and the component of $b_0$ lying in this subspace is not significant. Then we would like to eliminate the terms in the second sum of (27) with $i \in S$. This can be accomplished by choosing the $p_i$ so that $p_i \sigma_i^2 \approx 0$ for $i \in S$ and $p_i \sigma_i^2 \approx 1$ for $i \notin S$. Then $G'_n(\sigma_i, k) \approx 0$ for small values of $k$ and for $i \in S$. Since $(\epsilon_0, u_i) \approx 0$ for $i \notin S$, the second sum of (27) will be almost zero until many iterations have passed, by which time the rest of the image has converged. The effect is the same as if the noise had been eliminated, if the iteration is stopped when the image has converged.

Fig. 2. The convergence properties of (a) data gain $G'_d(\sigma, k)$, (b) noise gain $G'_n(\sigma, k)$, and (c) initial condition gain $G'_{x_0}(\sigma, k)$ versus singular-value $\sigma$ for the first 30 generalized Landweber iterations. The first, second, and third iterations plus the last iteration are labeled.

Similarly, features in the image can be enhanced relative to other features. To see this, suppose we wish to enhance some features of $x$ in the subspace spanned by $\{v_i, i \in S'\}$ where $S' \subset \{1, \ldots, \rho(A)\}$. Then we would like to enhance the terms in the first sum of (27) with $i \in S'$, or to eliminate the terms with $i \notin S'$. Choosing the $p_i$ such that $p_i \sigma_i^2 \approx 1$ for $i \in S'$ and $p_i \sigma_i^2 \approx 0$ for $i \notin S'$ accomplishes this by making the desired features converge rapidly and the undesired features converge slowly, as desired.
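
The filtering argument can be made concrete with two numbers. In the Python sketch below, the step values 0.95 and 0.001 and the iteration count 5 are hypothetical choices standing in for "close to one" and "close to zero"; the function simply evaluates the gain form of (28) as a function of the product of the shaping scalar and the squared singular-value.

```python
def generalized_data_gain(step, k):
    """Gain of (28), written in terms of step = p_i * sigma_i^2."""
    return 1.0 - (1.0 - step) ** k

k = 5
kept = generalized_data_gain(0.95, k)         # component to keep: step near 1
suppressed = generalized_data_gain(0.001, k)  # component to filter: step near 0
print(f"after {k} iterations: kept = {kept:.6f}, suppressed = {suppressed:.6f}")
```

After five iterations the kept component is essentially fully recovered while the suppressed one has a gain below half a percent, so stopping the iteration at that point acts as a filter.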

E. Comparison of Computational Requirements

Direct computation of the minimum-norm least-squares solution (8) may require a lot of space for storage of the matrices $U$ and $V$ and a considerable amount of computation time, and it may amplify noise to a great extent. For example, if the system matrix $A$ is 4000 × 4000, and assuming 4 bytes to store a number in floating-point format, 128 million (= 4000 × 4000 × 4 × 2) bytes will be required to store $U$ and $V$. As for the computation, suppose that the SVD of $A$ is already known after a previous (lengthy) computation, and that $A$ is 97% sparse (3% of the elements of $A$ are nonzero). Direct computation of $x^*$ in (8) is equivalent to about 33 (= 2/0.06) Landweber iterations if the sparse structure of $A$ is utilized in the Landweber iteration.

In some applications, $A$ becomes sparser as its size is increased. For example, a 150-million-element system matrix in [6] has about 2 million nonzero elements, so that it is 98.7% sparse. The direct computation of $x^*$ in (8) is equivalent to about 75 (= 2/0.026) Landweber iterations. Hence, even apart from storage problems and computing the SVD of $A$, iterative algorithms become more favorable over direct computation as the size and sparsity of $A$ grow.
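
The storage and operation counts quoted above can be reproduced with a few lines of Python. The cost model in this sketch is an assumption made for illustration: direct evaluation of (8) is taken to cost two dense matrix-vector products, and one Landweber iteration is taken to cost one sparse forward and one sparse backward projection. The function names are hypothetical.

```python
def svd_storage_bytes(m, n, bytes_per_number=4):
    """Bytes needed to store dense U (m x m) and V (n x n)."""
    return (m * m + n * n) * bytes_per_number

def equivalent_landweber_iterations(nonzero_fraction):
    """Assumed cost of (8), two dense matrix-vector products, divided by the
    assumed cost of one iteration, two sparse matrix-vector products."""
    return 2.0 / (2.0 * nonzero_fraction)

storage = svd_storage_bytes(4000, 4000)
iters_3_percent = equivalent_landweber_iterations(0.03)
iters_ref6 = equivalent_landweber_iterations(2e6 / 150e6)
print(storage, round(iters_3_percent, 1), round(iters_ref6, 1))
```

Under this cost model the 97% sparse example gives about 33 iterations, and the matrix with 2 million nonzero elements out of 150 million gives 75.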

The number of forward and backward projections used in one algorithm can be treated as an index indicating how much time the algorithm needs. Generally speaking, the Landweber, ART, and MLEM iterations all require one forward and one backward projection for each iteration. Here the operation of multiplying $A$ by a vector is called a "forward projection" and multiplying $A^T$ by a vector is a "backward projection."

To determine the computational load of the generalized Landweber iteration, let $F(\lambda) = a_0 + a_1 \lambda + a_2 \lambda^2 + a_3 \lambda^3$ so that $D = a_0 I + a_1 A^T A + a_2 (A^T A)^2 + a_3 (A^T A)^3$ where $a_0, \ldots, a_3$ are scalars and coefficients of the polynomial function. Then the generalized Landweber iteration can be written as

$$ x^k = x^{k-1} + A^T A (A^T A (a_3 A^T A r^{k-1} + a_2 r^{k-1}) + a_1 r^{k-1}) + a_0 r^{k-1} \tag{35} $$

where $r^{k-1} = A^T (b - A x^{k-1})$. Equation (35) requires 4 forward and 4 backward projections. Generalizing, it is clear that using a $p$th-order polynomial function requires $p + 1$ forward and $p + 1$ backward projections, so that one generalized Landweber iteration is approximately equal to $p + 1$ Landweber, ART, or MLEM iterations in terms of computation time.
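
The nested form of (35) extends to any polynomial order, and counting the projections in code confirms the tally. The Python sketch below is a minimal illustration using plain lists; the function names, the projection counter, and the 2 x 2 test system are assumptions, while the coefficients are those of (34).

```python
def forward(A, x):
    """Forward projection A x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def backward(A, y):
    """Backward projection A^T y."""
    return [sum(A[i][j] * y[i] for i in range(len(A)))
            for j in range(len(A[0]))]

def generalized_landweber_step(A, b, x, coeffs, counter):
    """One update x + F(A^T A) A^T (b - A x), with F given by coeffs
    (lowest order first) and evaluated in the nested form of (35)."""
    residual = [bi - yi for bi, yi in zip(b, forward(A, x))]
    r = backward(A, residual)
    counter["forward"] += 1
    counter["backward"] += 1
    acc = [coeffs[-1] * ri for ri in r]
    for c in reversed(coeffs[:-1]):
        acc = backward(A, forward(A, acc))  # multiply by A^T A
        counter["forward"] += 1
        counter["backward"] += 1
        acc = [ai + c * ri for ai, ri in zip(acc, r)]
    return [xi + ai for xi, ai in zip(x, acc)]

# Hypothetical system with singular-values 1 and 0.5; exact solution is [1, 2].
A = [[1.0, 0.0], [0.0, 0.5]]
b = [1.0, 1.0]
coeffs = [31.5, -315.0, 1443.75, -3465.0, 4504.5, -3003.0, 804.375]
counter = {"forward": 0, "backward": 0}
x = generalized_landweber_step(A, b, [0.0, 0.0], coeffs, counter)
print(x, counter)
```

With the sixth-order polynomial of (34), one update uses 7 forward and 7 backward projections, and a single update from a zero image already brings the second component to about 1.85 of its limiting value 2.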

Refer to the data gains $G_d(\sigma_i, k)$ and $G'_d(\sigma_i, k)$ in (20) and (28), respectively, to get an idea of how many iterations are needed to recover one image component on a particular singular-vector. Table I shows the number of iterations needed to recover 95% of a component on different singular-vectors for the Landweber and generalized Landweber iterations, computed from (20) and (28). The polynomial function (34) is used here in the generalized Landweber iteration. Multiplying the number of iterations for the generalized Landweber iteration in Table I by 7, we can see that the components on singular-vectors with all but the largest three singular-values still converge faster in the generalized Landweber iteration. It is clear that the convergence rate can be accelerated using the generalized Landweber iteration, even though each iteration takes more time to compute.

TABLE I
The Number of Iterations Needed to Recover 95% of a Component on Different Singular-Vectors

$\sigma$     Landweber     Generalized Landweber
0.9          2             1
0.7          5             1
0.5          11            2
0.3          32            1
0.1          299           9
0.09         369           12
0.07         610           19
0.05         1197          38
0.03         3328          106
0.01         29951         951
0.009        36982         1174
0.007        61143         1941
0.005        119951        3804
0.003        332847        10567
0.001        2956472       95188
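
The iteration counts for 95% recovery follow directly from the data gains in (20) and (28). The Python sketch below finds the smallest iteration count whose gain reaches 0.95; the function names and the simple linear search are illustrative assumptions, and the sketch is checked here only for the singular-values 0.5, 0.1, and 0.05.

```python
# Coefficients of the polynomial in (34), lowest order first.
COEFFS = [31.5, -315.0, 1443.75, -3465.0, 4504.5, -3003.0, 804.375]

def shaping_polynomial(lam):
    """Evaluate the polynomial of (34) at lam by Horner's rule."""
    value = 0.0
    for c in reversed(COEFFS):
        value = value * lam + c
    return value

def iterations_to_recover(step, fraction=0.95, max_iters=10 ** 7):
    """Smallest k with 1 - (1 - step)^k >= fraction."""
    remaining = 1.0
    for k in range(1, max_iters + 1):
        remaining *= 1.0 - step
        if 1.0 - remaining >= fraction:
            return k
    raise ValueError("fraction not reached within max_iters")

def landweber_iterations(sigma):
    return iterations_to_recover(sigma ** 2)

def generalized_iterations(sigma):
    lam = sigma ** 2
    return iterations_to_recover(lam * shaping_polynomial(lam))

for sigma in (0.5, 0.1, 0.05):
    print(sigma, landweber_iterations(sigma), generalized_iterations(sigma))
```

Multiplying the generalized count by 7, the number of projection pairs per generalized iteration with the sixth-order polynomial, gives the comparison in equivalent Landweber iterations.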

IV.  The  Multigrid  Method 

The  purpose  of  applying  a  multigrid  method  to  image 
reconstruction  is  to  accelerate  the  iterative  reconstruction 
by  first  using  a  coarse-grid  iteration  to  provide  an  initial 
condition for the more computation-demanding fine-grid
iteration.  The  idea  from  [22]  is  that  low-frequency  com¬ 
ponents  of  an  image  can  be  reconstructed  with  less  effort 
using  the  coarse-grid  iteration;  then  applying  the  fine-grid 
iteration  results  in  an  efficient  reconstruction  of  high-fre¬ 
quency  components. 

Ranganath  et  al.  [22]  used  a  multigrid  method  to  ac¬ 
celerate  the  convergence  rate  of  the  MLEM  algorithm.  It 
was  shown  in  [22]  that  this  multigrid-MLEM  approach 
reduced  the  computation  time  for  reconstructing  a  128  x 
128 noisy Shepp-Logan phantom, using a sum of the
squared  errors  criterion.  However,  the  convergence  be¬ 
havior  of  the  multigrid  method  for  image  reconstruction 
in  general,  was  not  considered  in  [22]. 

In Section V, we will investigate the convergence properties of the multigrid method, using an interpolation technique in which each coarse-grid pixel is evenly divided or distributed into its four neighboring fine-grid pixels. This interpolation is identical to the zero-order interpolation [27], except for a multiplicative constant of one quarter which we include to preserve the number of total counts. The multigrid implementations of the Landweber, generalized Landweber, ART without relaxation [2], and MLEM [5] algorithms are all investigated. Three noise-free images with different frequency contents are used in this study.

Our multigrid implementation is set up as follows. A coarse grid is defined where the center of a coarse-grid pixel is located at the center of the four neighboring fine-grid pixels. Then a coarse-grid system matrix associated with the coarse grid is computed and stored. Note that the set of projections (elements of $b$) for the coarse-grid geometry usually does not coincide with the projections of the fine-grid geometry.

In our multigrid method, the projection data are first derived from a noiseless projection of a synthetic image through the fine-grid system. A subset of projection data which corresponds to the set of projections of the coarse-grid system is chosen as the simulated projections from the synthetic image for the coarse-grid system. For conservation of total counts, the coarse-grid projection data are normalized such that the total counts of the coarse-grid system are the same as those of the fine-grid system. Since we are investigating to what extent the multigrid method can contribute to the image reconstruction, we use directly the minimum-norm least-squares solution of the coarse-grid system, which is equivalent to the convergent solution of the coarse-grid iterations (Landweber, generalized Landweber, and ART) with zero initial condition. Then the interpolation is applied on the coarse-grid convergent solution to make a fine-grid image. A positivity constraint and another conservation of total counts are applied on the fine-grid image to make up a fine-grid initial image. Finally, the fine-grid iteration with this fine-grid initial condition is executed 1000 times for investigation of its convergence behavior.
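
The coarse-to-fine step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the positivity constraint is implemented here by clipping negative pixels to zero, the count conservation by a uniform rescaling to a given total, and the function names and the 2 x 2 coarse image are hypothetical.

```python
def interpolate(coarse):
    """Split each coarse-grid pixel evenly into its 2 x 2 block of
    fine-grid pixels (zero-order interpolation scaled by one quarter)."""
    fine = []
    for row in coarse:
        fine_row = []
        for value in row:
            fine_row.extend([value / 4.0, value / 4.0])
        fine.append(fine_row)
        fine.append(list(fine_row))
    return fine

def fine_grid_initial_image(coarse, total_counts):
    """Interpolate, clip negative pixels to zero (assumed form of the
    positivity constraint), then rescale to the requested total counts."""
    fine = [[max(v, 0.0) for v in row] for row in interpolate(coarse)]
    total = sum(sum(row) for row in fine)
    scale = total_counts / total if total > 0 else 0.0
    return [[v * scale for v in row] for row in fine]

# Hypothetical coarse-grid solution with one negative pixel.
coarse = [[4.0, 8.0], [-2.0, 12.0]]
fine = interpolate(coarse)
initial = fine_grid_initial_image(coarse, total_counts=22.0)
print(fine)
print(initial)
```

The interpolation alone preserves the total of the coarse image; the second normalization is needed because clipping negative pixels changes the total.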

V.  Numerical  Results,  Discussion,  and 
Comparison 

A.  Experiments  on  Small  Scale  Systems 

Two  hypothetical  small-scale  PET  systems  are  used  in 
this section. Each has a detector ring of radius 1, and a
centered  object  support  of  1.2  x  1.2.  The  detector  rings 
of  PET-1  and  PET-2  are  equally  divided  into  26  and  16 
detectors,  respectively;  it  is  assumed  that  there  is  no  gap 
between  any  two  adjacent  detectors.  For  applying  the 
multigrid  method,  an  object  will  be  pixelated  as  either 
144  (=  12  X  12)  fine-grid  pixels  or  36  (=  6  x  6)  coarse- 
grid  pixels. 

A tube is defined by any two different detectors [5];
some of the tubes so defined may not cover any pixel of the
image at all. Theoretically, there are 325 (= 26 × 25/2)
and 120 (= 16 × 15/2) different tubes for PET-1 and
PET-2, respectively. However, in order to reduce
computation time, the number of different tubes or
projections is defined as the number of tubes which have
nontrivial intersections with at least one pixel.
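The tube counts above are simply the number of unordered detector pairs. A short check, with the hypothetical helper `count_useful_tubes` illustrating how tubes that miss every pixel would be discarded (the boolean table `hits` is an assumed input, not something the text specifies):

```python
from math import comb

def count_tubes(num_detectors):
    """Number of tubes defined by pairs of distinct detectors."""
    return comb(num_detectors, 2)

def count_useful_tubes(hits):
    """Count tubes intersecting at least one pixel.

    hits[t][i] is True when tube t intersects pixel i (hypothetical input).
    """
    return sum(1 for row in hits if any(row))

print(count_tubes(26), count_tubes(16))
```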

We assume the conditional probability P(t/i) of an
annihilation occurring at pixel i and its consequent
coincidence event being detected by tube t is proportional
to the angle of “looking” from pixel i into tube t, that is,

$$P(t/i) \propto \text{angle in radians from pixel } i \text{ looking into tube } t.$$

A normalization [5] makes $\sum_{t=1}^{t^*} P(t/i) = 1$, where $t^*$ is
the total number of tubes. The system matrix A is defined
by having $(A)_{t,i} = P(t/i)$ after normalization.
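A minimal sketch of this normalization, assuming the angles have already been computed and stored in an array `angles` with one row per tube and one column per pixel (both the array and the function name are illustrative assumptions):

```python
import numpy as np

def normalize_system_matrix(angles):
    """Scale each pixel's column so that its probabilities over all tubes sum to 1.

    angles[t, i] is the angle in radians from pixel i looking into tube t.
    """
    col_sums = angles.sum(axis=0, keepdims=True)
    # Leave all-zero columns untouched (pixels seen by no tube; assumed rare).
    safe_sums = np.where(col_sums > 0, col_sums, 1.0)
    return angles / safe_sums

angles = np.array([[0.2, 0.0],
                   [0.6, 0.5],
                   [0.2, 0.5]])
A = normalize_system_matrix(angles)
```

With this convention, the sum of the projections A x equals the sum of the pixel values x, which is what makes conservation of total counts meaningful.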

Two fine-grid system matrices $A_1$ (dimension 195 × 144)
and $A_2$ (84 × 144), both with 144 pixels, are defined
for PET-1 and PET-2, respectively, and two corresponding
coarse-grid system matrices (dimensions 175 × 36 and
72 × 36), both with 36 pixels, are also defined for PET-1
and PET-2, respectively. SVD analysis shows there are 144
and 84 nonzero singular-values for matrices $A_1$ and $A_2$,
respectively, which indicates that $N(A_1)$ is empty, i.e.,
$N(A_1) = \emptyset$, and $N(A_2)$ has dimension 60 (= 144 − 84)
[24].
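The null-space dimensions quoted above follow from counting nonzero singular values. The sketch below does this for random stand-in matrices of the same shapes; these are not the actual PET system matrices, and the tolerance `rel_tol` is an assumed choice.

```python
import numpy as np

def rank_and_nullity(A, rel_tol=1e-10):
    """Return (number of nonzero singular values, dimension of N(A))."""
    s = np.linalg.svd(A, compute_uv=False)
    rank = int(np.sum(s > rel_tol * s[0]))
    return rank, A.shape[1] - rank

# Random stand-ins with the shapes of the PET-1 and PET-2 fine-grid matrices.
rng = np.random.default_rng(0)
A_tall = rng.random((195, 144))
A_wide = rng.random((84, 144))
print(rank_and_nullity(A_tall), rank_and_nullity(A_wide))
```

A matrix with fewer rows than columns, such as the 84 × 144 case, necessarily has a null space of dimension at least 60.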

Fig. 3 shows three synthesized phantoms or images,
which are called image-1, image-2, and image-3. Image-1
primarily contains low-frequency components, while
image-2 and image-3 are explicitly chosen to illustrate the
importance of the “local smoothness” property defined
below (image-2 does not have this property; image-3
does). Three different initial conditions are investigated:
1) zero initial condition, in which every pixel is zero, 2)
average initial condition, in which every pixel is
initialized to the same value, namely the total counts
divided by the number of pixels, and 3) multigrid initial
condition, which comes from the interpolation of the
minimum-norm least-squares solution of the coarse-grid
system. Note that the zero initial condition cannot be used
for the MLEM iteration [5].
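The reason the zero initial condition is unusable for MLEM is that the MLEM update is multiplicative, so an all-zero image is a fixed point. The sketch below uses the standard MLEM update on a small hypothetical 2 × 2 system; the guarded division (ratio set to zero where the forward projection is zero) is an implementation assumption.

```python
import numpy as np

def mlem_step(A, b, x):
    """One standard MLEM update: x <- x * A^T(b / Ax) / A^T 1."""
    proj = A @ x
    # Guarded division: where the forward projection is zero, use ratio 0.
    ratio = np.divide(b, proj, out=np.zeros_like(proj, dtype=float), where=proj > 0)
    sensitivity = A.sum(axis=0)
    return x * (A.T @ ratio) / sensitivity

# Hypothetical system whose columns sum to 1, as after normalization.
A = np.array([[0.5, 0.2],
              [0.5, 0.8]])
x_true = np.array([2.0, 3.0])
b = A @ x_true

x_zero = mlem_step(A, b, np.zeros(2))          # stays at zero
x_avg = np.full(2, b.sum() / 2)                # average initial condition
initial_error = np.linalg.norm(x_avg - x_true)
for _ in range(500):
    x_avg = mlem_step(A, b, x_avg)
final_error = np.linalg.norm(x_avg - x_true)
```

Starting from the average image, the same update moves toward the true image while preserving the total counts.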

B.  Results  for  Small-Scale  Systems 

Figs. 4 and 5 show the performance of the Landweber,
generalized Landweber, ART, and MLEM iterations with
the three images in Fig. 3 and three types of initial
conditions (zero, average, multigrid) for PET-1 and PET-2,
respectively. Each curve shows results up to 1000
iterations. In Fig. 5, there are also some horizontal lines
which represent the results of setting $\rho_i = 1/\sigma_i^2$ in (27).
Each of these represents a least-squares solution made up
of 1) the minimum-norm least-squares solution (8), which
depends on the original image vector $x$, and 2) the
null-space component, which is the projection of the initial
condition $x_0$ (zero, average, or multigrid) onto $N(A_2)$.
These lines serve as lower bounds on Euclidean distance for
the least-squares type algorithms: ART (see corollary 9 of
Tanabe [23]); Landweber [see (19)]; and generalized
Landweber [see (27)]. The bounds are inapplicable to the
MLEM algorithm. Note that the MLEM algorithm with
average initial condition breaks some of these bounds in
Fig. 5(a).
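The origin of these lower bounds can be illustrated numerically for the plain Landweber iteration. The sketch assumes noiseless data and a small hypothetical 3 × 5 system with a two-dimensional null space; equations (8), (19), and (27) are not reproduced here, so the code relies only on the standard result that the iteration converges to the minimum-norm least-squares solution plus the projection of the initial condition onto the null space.

```python
import numpy as np

def landweber(A, b, x0, step, num_iter):
    """Plain Landweber iteration x <- x + step * A^T (b - A x)."""
    x = np.array(x0, dtype=float)
    for _ in range(num_iter):
        x = x + step * (A.T @ (b - A @ x))
    return x

def least_squares_limit(A, b, x0):
    """Minimum-norm least-squares solution plus projection of x0 onto N(A)."""
    A_pinv = np.linalg.pinv(A)
    null_projector = np.eye(A.shape[1]) - A_pinv @ A
    return A_pinv @ b + null_projector @ x0

# Hypothetical system with singular values 1.0, 0.8, 0.6.
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((3, 3)))
V, _ = np.linalg.qr(rng.standard_normal((5, 5)))
A = U @ np.diag([1.0, 0.8, 0.6]) @ V[:, :3].T
x_true = rng.random(5)
b = A @ x_true
x0 = np.ones(5)

x_limit = least_squares_limit(A, b, x0)
x_iter = landweber(A, b, x0, step=1.0, num_iter=500)
# Distance that no iterate started from x0 can get below.
lower_bound = np.linalg.norm(x_true - x_limit)
```

The bound equals the norm of the null-space part of the difference between the true image and the initial condition, which is why a better initial condition can lower it.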

In Fig. 5(c) [image-3 in PET-2], the multigrid
implementations of all algorithms break the lower bounds
associated with the zero and average initial conditions! In
fact, the coarse-grid iteration has recovered some image
features lying in the null space $N(A_2)$ of the fine grid.

Fig. 3. (a) Image-1, (b) image-2, and (c) image-3. Each image is scaled to
have black indicating the biggest magnitude, and white indicating the
smallest magnitude. Image-1 has no high-frequency components, image-2
does not have the local smoothness property, and image-3 does have the
local smoothness property.

Since the coarse-grid and fine-grid system geometries
differ, the coarse-grid solution after interpolation may
include components lying in the null space of the fine-grid
system matrix. In this example, the result of the
coarse-grid iteration furnishes extra information. The
coarse-grid iteration basically estimates the value of a
coarse-grid pixel representing its four neighboring
fine-grid pixels, which have the same magnitude in image-3,
and the interpolation does not violate any edge or boundary
of this particular image. Therefore, some high-frequency
components may be picked up from this multigrid procedure
for image-3.

Fig. 4. Euclidean distance between reconstructed and actual images versus
iteration number in PET-1 for (a) image-1, (b) image-2, and (c) image-3
for different algorithms. LAND(O): Landweber iteration with zero initial
condition, G-LAND(A): generalized Landweber iteration with average
initial condition, LB(M): Euclidean distance lower bound for multigrid
initial condition, etc.

Fig. 5. Euclidean distance between reconstructed and actual images versus
iteration number in PET-2 for (a) image-1, (b) image-2, and (c) image-3
for different algorithms.

We now define a local smoothness property: the values
of the four neighboring fine-grid pixels that are grouped
together as a coarse-grid pixel are close to each other.
Image-3 has this property, and the convergence of its
high-frequency components is significantly accelerated using
the multigrid implementation. The improvement in quality
of the reconstructed image for image-3 in Fig. 5(c) can
be seen by comparing Fig. 6(a), the convergent result of
using zero initial condition [by setting $\rho_i = 1/\sigma_i^2$ in (27)],
and Fig. 6(b), the convergent result of using the multigrid
initial condition (also by setting $\rho_i = 1/\sigma_i^2$). Fig. 6(b)
more closely resembles the original image, Fig. 3(c).
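The local smoothness property is qualitative, but one hypothetical way to quantify it is the largest spread of values inside any 2 × 2 block that forms a coarse-grid pixel. The indicator `block_spread` below is an illustrative assumption, not a definition taken from the text, and the two test images are toy examples rather than image-2 or image-3.

```python
import numpy as np

def block_spread(img):
    """Largest (max - min) over the 2 x 2 blocks that form coarse-grid pixels."""
    h, w = img.shape
    blocks = img.reshape(h // 2, 2, w // 2, 2)
    return float((blocks.max(axis=(1, 3)) - blocks.min(axis=(1, 3))).max())

def coarsen_then_replicate(img):
    """Average each 2 x 2 block, then copy the average back to its four pixels."""
    h, w = img.shape
    means = img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return np.kron(means, np.ones((2, 2)))

# Toy images: constant 2 x 2 blocks versus a checkerboard.
smooth = np.kron(np.arange(4.0).reshape(2, 2), np.ones((2, 2)))
checker = (np.indices((4, 4)).sum(axis=0) % 2).astype(float)
```

A spread of zero means the coarse grid loses nothing, so coarsening and replicating returns the image exactly; the checkerboard is the opposite extreme.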

C.  Discussion  of  Results  for  Small-Scale  Systems 

We now summarize the results in Figs. 4 and 5, and
discuss them with the aid of decompositions or
representations of the images on the singular-vectors $v_i$ of
both $A_1$ and $A_2$. Since the computation in one generalized
Landweber iteration with the polynomial function $F(\lambda)$ in
(34) is almost equivalent to 7 Landweber, ART, or MLEM
iterations, the abscissae indicating iteration number in
Figs. 4 and 5 do not show actual computation time for the
generalized Landweber iteration; since the abscissae have
logarithmic scales, the generalized Landweber curves may
simply be shifted to the right to reflect the actual
computation required. The discussion below is based on
amount of computation, not number of iterations.

For zero or average initial condition (see Figs. 4 and
5):

1) The Landweber iteration is usually the slowest of
the four methods.

2) The MLEM iteration is usually slower than ART,
but faster than the Landweber iteration. However, in some
cases, such as image-1 and image-3 in PET-1 [see Fig.
4(a) and (c)], the MLEM iterations with average initial
condition [in Fig. 4(a)] and with average and multigrid
initial conditions [in Fig. 4(c)] converge faster than the
generalized Landweber and ART iterations, and the
MLEM iteration with average initial condition in the case
of image-1 in PET-2 [see Fig. 5(a)] can even break the
least-squares lower bounds.

In order to understand the convergence behavior in
Figs. 4 and 5, the three images in Fig. 3 are decomposed
into components on the singular-vectors $v_i$ of both
systems. The decompositions for PET-1 and PET-2 are
shown in Figs. 7 and 8, respectively. There are 144
singular-vectors, ordered by the magnitudes of the
singular-values from right to left in the abscissa, with
index 144 for the singular-vector with the largest
singular-value, and with index 1 for the singular-vector
with the smallest singular-value. If the image has
high-frequency components, i.e., nonzero and large
components on the left, the iteration tends to take longer
to converge [see also Fig. 1(a) and Fig. 2(a)].
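Such a decomposition amounts to taking inner products of the image with the right singular vectors of the system matrix. The sketch below follows the ordering convention just described (the first returned entry corresponds to the smallest singular value); the function name and the tiny 3 × 4 example matrix are illustrative assumptions.

```python
import numpy as np

def singular_vector_decomposition(A, x):
    """Components of image x on the right singular vectors of A.

    Returns (sigma, coeffs), ordered so that the first entry corresponds to
    the smallest singular value; null-space directions have sigma = 0.
    """
    # full_matrices=True also returns the singular vectors spanning N(A).
    _, s, Vt = np.linalg.svd(A, full_matrices=True)
    sigma = np.zeros(A.shape[1])
    sigma[:s.size] = s
    coeffs = Vt @ x
    # NumPy orders singular values from largest to smallest, so reverse.
    return sigma[::-1], coeffs[::-1]

A = np.array([[3.0, 0.0, 0.0, 0.0],
              [0.0, 2.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
x = np.array([1.0, 2.0, 3.0, 4.0])
sigma, coeffs = singular_vector_decomposition(A, x)
```

Plotting the magnitudes of `coeffs` against index would give a figure of the same kind as the decompositions referred to above; the signs of the coefficients depend on the arbitrary signs of the singular vectors.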

For the multigrid initialization (see Figs. 4 and 5):

1) In Fig. 5(a), the reconstructed image hardly changes
in the course of iteration for image-1, its error lying just
above the least-squares bound. This suggests that the
initial condition may be very close to a least-squares
solution.

2) In Fig. 4(c) and Fig. 5(c), the multigrid iteration
accelerates the convergence rate dramatically for image-3
in both systems.

3) In Fig. 4(a), (b) and Fig. 5(b), the multigrid method
does not show any significant improvement over single-grid
algorithms for image-1 or image-2.

Fig. 6. Reconstructed images using (a) zero and (b) multigrid initial
conditions on image-3 in PET-2. Each image is scaled to have black
indicating the biggest magnitude, and white indicating the smallest
magnitude.

D.  Comparison  of  Results  for  Small-Scale  Systems 

With  the  aid  of  the  decompositions  of  the  multigrid  and 
average  initial  conditions  for  PET-1  and  PET-2  shown  in 
Figs.  9  and  10,  respectively,  we  can  make  the  following 
observations. 

1) Refer to the decompositions of average initial
conditions in PET-1 and PET-2 in Fig. 9(d)-(f) and Fig.
10(d)-(f), respectively. Since the average initial
conditions only contain low-frequency components (on the
singular-vectors with biggest singular-values), and these
components may be recovered after the first one or two
iterations (see Table I), using the average initial condition
may not speed up the convergence.

2) Compare the decomposition of image-1 in PET-1
[see Fig. 7(a)] to the decomposition of the multigrid
initial condition for image-1 in PET-1 [see Fig. 9(a)]. The
multigrid initial condition not only contains some
low-frequency components, but also introduces some
high-frequency “noisy” components which will take many
iterations to be corrected. In Fig. 4(a), the multigrid
Landweber and multigrid MLEM implementations, after
several iterations, are slower than their counterparts with
zero or average initial condition, and the multigrid
generalized Landweber iteration is slower than its
counterpart with zero or average initial condition.

Fig. 7. The decompositions of (a) image-1, (b) image-2, and (c) image-3
on the singular-vectors $v_i$ of $A_1$.

3) Compare the decomposition of image-1 in PET-2
[see Fig. 8(a)] to its corresponding multigrid
decomposition [see Fig. 10(a)]. The multigrid initial
condition, which is close to a least-squares solution, has a
component similar to the minimum-norm least-squares
solution represented by the elements with indices from 61
to 144, and also has a component represented by the
elements with indices from 1 to 60 in $N(A_2)$. This moves
each reconstructed image $x_k$ a distance away from the true
solution [see Fig. 5(a)].

Fig. 8. The decompositions of (a) image-1, (b) image-2, and (c) image-3
on the singular-vectors $v_i$ of $A_2$.

Fig. 9. The decompositions of the multigrid initial conditions of (a)
image-1, (b) image-2, and (c) image-3, and the decompositions of the
average initial conditions of (d) image-1, (e) image-2, and (f) image-3 on
the singular-vectors $v_i$ of $A_1$. (Axes: component magnitude versus
singular value index.)

PAN AND YAGLE: NUMERICAL STUDY OF MULTIGRID IMPLEMENTATIONS

4) Compare the decomposition of image-2 in PET-1
[see Fig. 7(b)] to its corresponding multigrid
decomposition [see Fig. 9(b)], and the decomposition of
image-2 in PET-2 [see Fig. 8(b)] with its corresponding
multigrid decomposition [see Fig. 10(b)]. The multigrid
initial conditions do not recover most of the components,
and do not help to accelerate the convergence rate for
image-2 in both systems. The reason for this is as follows.
Every coarse-grid pixel is split into 4 equivalent parts to
initialize the fine-grid iteration. However, since image-2
does not have the local smoothness property defined above,
this fine-grid initialization does not resemble the actual
image.

5) Compare the decomposition of image-3 in PET-1
[see Fig. 7(c)] to its corresponding multigrid
decomposition [see Fig. 9(c)], and the decomposition of
image-3 in PET-2 [see Fig. 8(c)] with its corresponding
multigrid decomposition [see Fig. 10(c)]. The multigrid
initial conditions not only contain some low-frequency
components for image-3 in both systems, but also contain
some high-frequency components in PET-1 and some
components in the null space $N(A_2)$ of PET-2. Therefore
using the multigrid method can greatly accelerate the
convergence rate for image-3, which does have the local
smoothness property. Also, since $N(A_2)$ is not empty, it is
impossible for the iterations with zero initial condition to
catch up to the multigrid iterations for image-3 in PET-2.

E.  Conclusions  from  Results  for  Small-Scale  Systems 

The  decompositions  in  Figs.  7-10  show  that  the  mul¬ 
tigrid  method  accelerates  the  convergence  rate  for  high- 
frequency  components  if  the  image  has  the  local  smooth¬ 
ness  property,  as  image-3  does. 

The multigrid method does not improve the convergence
rate when the image has no high-frequency components, as
in image-1, or does not have the local smoothness property,
as in image-2. Note that any image will have the local
smoothness property if enough pixels are used to describe
it; conversely, the multigrid method could be used to
identify images described by too many pixels.

The advantage of using the multigrid method, from our
simulation results, is its capability of recovering
high-frequency components more quickly, if an image has the
local smoothness property. It is true that if the
coarse-grid iteration recovers some low-frequency components
of the image for the subsequent fine-grid iteration, then the
fine-grid iteration will perform better in the
reconstruction of the low-frequency components in its first
several iterations. However, the fine-grid iteration itself
may be very capable of recovering low-frequency components,
and the benefit of using the coarse-grid iteration to
effectively recover low frequencies in the multigrid method
may soon disappear after the first several fine-grid
iterations. Then the single-grid iteration will be as good
as the multigrid iteration. The use of two different grid
geometries (coarse and fine) may result in the recovery of
elements in the null space of the fine-grid system matrix.
In terms of frequencies, this can be interpreted by noting
that null space components can be viewed as infinitely
high-frequency components (i.e., zero singular-values),
which can never be recovered by using the fine-grid
iteration with zero initial condition.

In some cases, the coarse-grid iteration may be unable
to recover high-frequency components, and it may even
introduce some incorrect high-frequency components!
Then the multigrid method should show no sign of being
able to accelerate the convergence rate after the first
several iterations. This may help to explain why the paths
of convergence for the multigrid Landweber and multigrid
MLEM iterations in Fig. 4(a) start better than the
single-grid implementations in the first several iterations,
and then become worse after about the ninth iteration in the
Landweber and the fourth in the MLEM.

We refer to this as the “starting good but ending bad”
phenomenon, which is due to an improper initial condition
for the fine-grid iteration. This may explain a result in
[14], in which using the image from FBP to initialize the
conjugate gradient iteration [14] did not accelerate the
convergence rate of the iteration (see [14, Fig. 4]). Despite
the good starting point, the conjugate gradient method
showed this phenomenon after the 75th iteration. This may
be because the FBP initial condition might have introduced
some high-frequency “noisy” components into the
reconstructed image, which dominated the image in the
iteration after the low-frequency components had been
recovered in the first 75 iterations.

F.  Experiments  on  a  Large  Scale  System 

A geometry similar to the one in [5], with 128 detectors
and a square array of 4096 (= 64 × 64) fine-grid pixels
circumscribed by the detector ring, was used as a more
realistic large-scale system to verify the conclusions from
the previous small-scale systems. The object support was
assumed to be in the inscribed circle of the square array,
which accounts for 3228 fine-grid pixels or 812 coarse-grid
pixels, determined by counting the center positions of fine
or coarse grids inside the inscribed circle. Again
neglecting projections that did not intersect any pixel in
the object support, we found that the corresponding
fine-grid system geometry was modeled by a system matrix
of dimension 4160 × 3228, and the coarse-grid system by a
matrix of dimension 4152 × 812.

The Shepp-Logan phantom [12] shown in Fig. 11(a) is
used as the object. This phantom can be seen to satisfy
the local smoothness property, except at its boundaries.
The multigrid procedure was identical to that used on the
small-scale systems, with the exception that the
coarse-grid convergent solution was obtained using a
conjugate gradient iteration [14], [25], to ensure
convergence after a finite number of iterations [25] (812
here). The result of the coarse-grid iteration was
investigated. It was found that the reconstructed image of
the coarse-grid iteration did not change much (in terms of
Euclidean distance between two consecutive images) after
about 16 iterations, which are almost equivalent in
computation to four fine-grid iterations. Note that a single
conjugate-gradient iteration needs two forward and two
backward projections [14]. So the four fine-grid
conjugate-gradient iterations are about equal to one
fine-grid generalized Landweber iteration, or eight
fine-grid Landweber, ART, or MLEM iterations. The
convergent image after interpolation is shown in Fig. 11(b),
which was used as the multigrid initial condition for all
fine-grid iterations.

Fig. 11. (a) Shepp-Logan phantom and (b) multigrid initial condition. Each
image is scaled to have black indicating the biggest magnitude, and white
indicating the smallest magnitude.

The simulation results are shown in Fig. 12. The
Landweber, ART, and MLEM iterations all benefit from the
multigrid implementation, even with eight extra fine-grid
iterations included for each to account for the coarse-grid
iterations. Only the generalized Landweber iteration does
not improve its convergence behavior. The reason is that
the multigrid initial condition contains only the frequency
components with singular values of $\sigma > 0.2$, and these
will disappear after only three generalized Landweber
iterations [see Fig. 2(c)]. If the phantom is represented as
an image of 128 × 128 pixels, as in [22], then the
local-smoothness property will become more prominent and
the convergence rate may be accelerated even more.
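The way an initial condition fades during iteration can be made concrete for the plain Landweber iteration, where the component of the initial condition along a singular vector with singular value σ is multiplied by (1 − ρσ²) at every step. The sketch below covers only this plain case, with an assumed step size ρ = 1 and a hypothetical diagonal system; the generalized Landweber iteration, whose shaping polynomial (34) is not reproduced here, is designed to shrink these components faster.

```python
import numpy as np

def initial_condition_weight(sigma, step, k):
    """Factor multiplying the initial-condition component after k Landweber steps."""
    return (1.0 - step * sigma ** 2) ** k

def landweber(A, b, x0, step, num_iter):
    """Plain Landweber iteration x <- x + step * A^T (b - A x)."""
    x = np.array(x0, dtype=float)
    for _ in range(num_iter):
        x = x + step * (A.T @ (b - A @ x))
    return x

# With b = 0 the solution is zero, so the iterate is exactly what remains
# of the initial condition along each singular vector.
sigmas = np.array([1.0, 0.5, 0.2])
A = np.diag(sigmas)
x3 = landweber(A, np.zeros(3), np.ones(3), step=1.0, num_iter=3)
weights = initial_condition_weight(sigmas, 1.0, 3)
```

In this plain case a component with σ = 0.2 still retains most of its initial value after three iterations, which illustrates why components with small singular values dominate the late convergence behavior.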

One interesting point about ART is that we do not get
quite the same convergence behavior as we saw in the
small-scale systems PET-1 and PET-2. Here [see Fig.
12(c)], the ART iteration with zero initial condition
converges considerably more slowly than the same iteration
with average or multigrid initial condition in the first 40
iterations. The reason may be that the average initial
condition (i.e., every pixel is positive) is geometrically
closer to the true object (the Shepp-Logan phantom, whose
pixels are all positive) than the zero initial condition.
This makes the iteration much faster.


G. Comparison to the Results of Ranganath

In [22], Ranganath et al. proposed a multigrid MLEM
algorithm. They claimed that “the low-frequency
components of the solution are effectively approximated and
recovered on the coarse grids while the high-frequency
components are recovered on the fine grids.” The first
part of the statement, about one of the capabilities of the
coarse-grid iteration, is correct. However, the potential
capabilities of the multigrid method to recover
high-frequency components in the case of local smoothness,
and to introduce incorrect high-frequency components, were
not mentioned. The second part of the statement, about the
role played by the fine-grid iteration, is also correct.
However, this role should not be emphasized in using the
multigrid method, for the following three reasons.

1) For the Landweber and generalized Landweber
iterations, the third sums in (19) and (27) show that the
initial condition dies out geometrically [this is illustrated
in Fig. 1(c) and Fig. 2(c)]. This suggests that the
low-frequency components recovered by the coarse-grid
iteration do not remain unaltered during the fine-grid
iteration. Hence the fine-grid iteration must do more than
reconstruct high-frequency components.

2) The convergence rate of the fine-grid iteration in
recovering high-frequency components is not good, unless
the local smoothness property also holds (as it might
in the example used in [22]).

3) Because of the nonuniform convergence property
noted in Section III, it is likely that, for images without
high-frequency components, after the first several
iterations the single-grid iteration may be as good as the
multigrid iteration, or even better if some incorrect
high-frequency components were introduced by the
coarse-grid iteration.

In interpreting the results of [22] in light of the
guidelines proposed in Section V-E, the following points
should be noted.

1) The original image was represented by 128 × 128
pixels, which benefits the multigrid method, since the fine
sampling induces the local smoothness property.

2) The number of iterations used was small: “four”
fine-grid iterations. This again shows the multigrid
method in its best light. Since it has received some
low-frequency components from the coarse-grid iteration, the
subsequent fine-grid iteration will seem to have effectively
recovered more low-frequency components in its first
several iterations than the single-grid iteration. The
fine-grid iteration may in fact not have done much for the
high-frequency components in these first several
iterations.

The benefit of applying the MLEM multigrid approach
to the Shepp-Logan phantom in the first several iterations
can also be seen in Fig. 12(d). Thus the results of [22]
seem to be in agreement with the guidelines for the
multigrid method. However, this paper has treated several
different images, and multigrid implementations of several
different algorithms.

Fig. 12. Euclidean distance between reconstructed and actual images
versus iteration number for (a) Landweber, (b) generalized Landweber, (c)
ART, and (d) MLEM. Solid line is for zero initial condition, dotted line
for average, and dashed line for multigrid.

VI. Summary

A numerical study of multigrid implementations of
several iterative image reconstruction algorithms has been
presented. The generalized Landweber iteration was
shown to have a faster convergence rate than the
Landweber iteration if a proper shaping matrix is used. The
generalized Landweber iteration also has several
advantages over ART, including parallelizability, control
over convergence rate, and ability to filter the data or
image as a part of the iteration. The effects of noise and
initial condition on the convergence rate of the generalized
Landweber iteration were also studied.

The multigrid implementation was found to accelerate
the convergence rate of high-frequency components of the
image when the image possessed the local smoothness
property. In other cases it was unhelpful, and could even
slow down the convergence rate. Results of some other
papers were interpreted in light of the conclusions drawn
here. Unresolved issues include determination of
sampling rates that will ensure the local smoothness
property, other ways of initializing the iterations, and
incorporation of a more general nonlinear D operator (e.g.,
a nonnegativity constraint) in the generalized Landweber
iteration. We believe this paper has presented some
valuable insight into the convergence behavior of various
implementations of various iterative algorithms.
Abstract

The numerical behavior of multigrid implementations of
the Landweber, generalized Landweber, ART, and MLEM
iterative image reconstruction algorithms is investigated.
Comparisons between these algorithms, and with their
single-grid implementations, are made on one small-scale
synthetic PET system, for phantom objects exhibiting
different characteristics. The original contribution is a
numerical study of the convergence rates of single-grid and
multigrid implementations of the Landweber, generalized
Landweber, ART, and MLEM iterations.

I.  INTRODUCTION 

Initialization of an iterative algorithm can be as simple as
setting all pixels to zero, or as complex as using the result
of filtered back-projection (FBP) [1]. The presumption
behind the latter approach is that FBP should furnish a
starting point that is “close” to the desired image, after
which the iteration would quickly converge. Surprisingly,
this approach has shown little success [1]. In Section III
our numerical results lead to a possible explanation of why
this is so. We also examine the effects of different initial
conditions on the convergence behavior of different
iterative algorithms.

The contribution of this paper is a numerical study of
multigrid implementations of various iterative algorithms:
Landweber [2], generalized Landweber [2], ART [3], and
MLEM [4]. In a multigrid implementation, the iteration
is first used on a coarse grid until it converges; the result
is then interpolated and used as an initial condition on
a fine grid. The coarse-grid iteration requires much less
computation per iteration. Although two different system
geometry matrices are needed (one for each grid), both
matrices are relatively sparse, so the additional storage
required for the coarse-grid system matrix is minor.
14-90-J-1897.


We define a local smoothness property: the values of
four neighboring fine-grid pixels to be grouped as a
coarse-grid pixel are close to each other. Our results
indicate that if the image has this property, then the
convergence of high-frequency components of the image can be
significantly accelerated using a multigrid implementation.
“Local smoothness” is not strictly defined here, nor does it
need to be: the more this property holds, the greater the
acceleration of convergence. If the local smoothness property
does not hold, or if there are no high-frequency components
in the image, then a multigrid implementation does
not seem to speed up the convergence rate. Ranganath et
al. [5] proposed a multigrid implementation of the MLEM
algorithm, and gave a numerical example. In this paper,
results for several different algorithms are given, more
examples are given, and more conclusions about
reconstruction behavior are drawn.

The  paper  is  organized  as  follows.  Section  II  summarizes 
the  idea  of  a  multigrid  implementation  and  presents  the 
multigrid  implementation  adopted  in  this  work.  Section 
III  presents,  summarizes,  and  discusses  numerical  results, 
and  presents  some  conclusions  about  convergence  rates  of 
multigrid  implementations.  Section  IV  concludes  with  a 
summary. 

II. The Multigrid Method

The  purpose  of  applying  a  multigrid  method  to  image  re¬ 
construction  is  to  accelerate  the  iterative  reconstruction 
by first using a coarse-grid iteration to provide an initial
condition for the more computationally demanding
fine-grid  iteration.  It  was  believed  [5]  that  low-frequency 
components  of  an  image  can  be  reconstructed  with  less 
effort  using  the  coarse-grid  iteration;  then  applying  the 
fine-grid  iteration  results  in  an  efficient  reconstruction  of 
high-frequency  components. 

Ranganath et al. [5] used a multigrid method to accelerate
the convergence rate of the MLEM algorithm. It
was shown that this multigrid-MLEM approach reduced
the computation time for reconstructing a 128 x 128 noisy
Shepp-Logan phantom, using a sum-of-squared-errors
criterion. However, the convergence behavior of the multigrid
method for image reconstruction in general was not
considered.

Our multigrid implementation is set up as follows. A
coarse grid is defined, in which the center of each coarse-grid
pixel is located at the center of four neighboring fine-grid
pixels. A coarse-grid system matrix associated
with the coarse grid is then computed and stored. Note that
the  set  of  projections  (elements  of  b)  for  the  coarse-grid 
geometry  usually  does  not  coincide  with  the  projections  of 
the  fine-grid  geometry. 

In  our  multigrid  method,  the  projection  data  are  first 
derived  from  a  noiseless  projection  of  a  synthetic  image 
through  the  fine-grid  system.  A  subset  of  projection  data 
which  corresponds  to  the  set  of  projections  of  the  coarse- 
grid  system  is  chosen  as  the  simulated  projections  from 
the synthetic image for the coarse-grid system. To conserve
total counts, the coarse-grid projection data are
normalized so that their total counts equal those
of the fine-grid system. Since we are investigating to what
extent the multigrid method can contribute to the image
reconstruction, we directly use the minimum-norm least-squares
solution of the coarse-grid system, which is equivalent
to the converged solution of the coarse-grid iterations
(Landweber, generalized Landweber, and ART) with zero
initial condition. The coarse-grid solution is then interpolated
to form a fine-grid image.
A positivity constraint and a second total-count normalization
are applied to the fine-grid image to produce the
fine-grid initial image. Finally, the fine-grid iteration with
this initial condition is executed 1000 times to
investigate its convergence behavior.
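
The initialization steps above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the use of `numpy.linalg.pinv` for the minimum-norm least-squares solution, and the choice of `total_counts` as the target for the second normalization are assumptions. The interpolation is taken to be the replication of each coarse pixel into its 2 x 2 block of fine pixels, following the later remark that every coarse-grid pixel is split into four equivalent parts; after the rescaling step it makes no difference whether each fine pixel receives the coarse value or a quarter of it.

```python
import numpy as np

def multigrid_initial_image(A_coarse, b_coarse, total_counts, n_coarse):
    """Build a fine-grid initial image from the coarse-grid solution.

    A_coarse, b_coarse : coarse-grid system matrix and projection data
    total_counts       : assumed target for the sum of the fine-grid image
    n_coarse           : the coarse image is n_coarse x n_coarse pixels
    """
    # Minimum-norm least-squares solution of the coarse-grid system.
    x_coarse = np.linalg.pinv(A_coarse) @ b_coarse
    # Interpolation (assumed): copy each coarse pixel into a 2 x 2 fine block.
    coarse_img = x_coarse.reshape(n_coarse, n_coarse)
    fine_img = np.kron(coarse_img, np.ones((2, 2)))
    # Positivity constraint.
    fine_img = np.clip(fine_img, 0.0, None)
    # Second conservation of total counts: rescale the image sum.
    current = fine_img.sum()
    if current > 0:
        fine_img = fine_img * (total_counts / current)
    return fine_img.ravel()
```

For the PET-1 geometry described below, `n_coarse` would be 6 and the returned vector would have 144 entries.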

III.  Numerical  results 

A.  Experiments 

One hypothetical small-scale PET system (PET-1), with
a detector ring of radius 1 and a centered object support
of 1.2 x 1.2, is used. The detector ring is equally divided
into 26 detectors, with no gap between adjacent
detectors. To apply the multigrid method, an object
is pixelated as either 144 (= 12 x 12) fine-grid pixels
or 36 (= 6 x 6) coarse-grid pixels.

A fine-grid system matrix $A_1$ [6] (dimension 195 x 144),
with  144  pixels,  is  defined  for  PET-1,  and  a  coarse-grid 
system  matrix  A\  (dimension  175  x  36),  with  36  pixels,  is 
also defined. SVD analysis shows that $A_1$ has 144 nonzero
singular values, which indicates that the null
space of $A_1$ is trivial (it contains only the zero vector).
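
A rank check of this kind can be sketched with a singular value decomposition. The function name and the relative tolerance `rtol` are assumptions made for illustration; the paper does not state how "nonzero" was decided numerically.

```python
import numpy as np

def null_space_dimension(A, rtol=1e-10):
    """Dimension of the null space of A, estimated from its singular values."""
    s = np.linalg.svd(A, compute_uv=False)  # singular values, descending
    tol = rtol * s[0]                       # assumed relative threshold
    rank = int(np.sum(s > tol))
    return A.shape[1] - rank
```

A result of 0 for a 195 x 144 matrix corresponds to 144 nonzero singular values, i.e., a trivial null space.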

Fig. 1 shows three images, called image-1,
image-2, and image-3. Image-1 primarily contains
low-frequency components, while image-2 and image-3
are explicitly chosen to illustrate the importance of the
"local smoothness" property defined earlier (image-2 does
not have this property; image-3 does). Three different initial
conditions are investigated: 1) zero initial condition,
in which every pixel is zero; 2) average initial condition,
in which every pixel is initialized to the same value,
namely the total number of counts divided by the number
of pixels; and 3) multigrid initial condition, which comes
from the interpolation of the minimum-norm least-squares
solution of the coarse-grid system. Note that the zero initial
condition cannot be used for the MLEM iteration [4].
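
The reason the zero initial condition fails for MLEM can be seen from the multiplicative form of the update. The sketch below uses the standard MLEM step; the function names and the small guard `eps` against division by zero are assumptions for illustration, not details taken from [4].

```python
import numpy as np

def average_initial_condition(total_counts, n_pixels):
    """Every pixel set to total counts divided by the number of pixels."""
    return np.full(n_pixels, total_counts / n_pixels)

def mlem_step(A, b, x, eps=1e-12):
    """One MLEM update: x <- x * A^T(b / (A x)) / (A^T 1)."""
    forward = A @ x
    # Guarded ratio: entries with a zero forward projection contribute 0.
    ratio = np.where(forward > eps, b / np.maximum(forward, eps), 0.0)
    sensitivity = A.T @ np.ones(A.shape[0])
    return x * (A.T @ ratio) / np.maximum(sensitivity, eps)
```

Because the current estimate multiplies the correction term, a pixel that starts at zero stays at zero, so an all-zero image is never updated. Any strictly positive start, such as the average initial condition, avoids this.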

B.  Results  and  Discussions 

Fig.  2  shows  the  performance  for  the  Landweber,  gener¬ 
alized  Landweber,  ART,  and  MLEM  iterations  with  the 
three  images  in  Fig.  1  and  three  types  of  initial  conditions 
(zero,  average,  multigrid)  for  PET-1. 
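
As a point of reference for curves of this kind, the sketch below shows one ART sweep in its common Kaczmarz form, together with a helper that records the Euclidean distance to the true image after each iteration. The relaxation parameter `relax`, the row ordering, and both function names are assumptions; the ART variant of [3] used in the experiments may differ in these details.

```python
import numpy as np

def art_sweep(A, b, x, relax=1.0):
    """One ART (Kaczmarz) sweep: project onto each row equation in turn."""
    x = np.array(x, dtype=float)
    for a_i, b_i in zip(A, b):
        norm_sq = a_i @ a_i
        if norm_sq > 0:
            x += relax * (b_i - a_i @ x) / norm_sq * a_i
    return x

def distance_history(step, x0, x_true, n_iter):
    """Euclidean distance to x_true before and after each of n_iter steps."""
    x = np.array(x0, dtype=float)
    history = [np.linalg.norm(x - x_true)]
    for _ in range(n_iter):
        x = step(x)
        history.append(np.linalg.norm(x - x_true))
    return history
```

Any of the iterations can be passed as `step`, for example `lambda x: art_sweep(A, b, x)`, and started from the zero, average, or multigrid initial condition.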

The  coarse-grid  iteration  basically  estimates  the  value  of 
a  coarse-grid  pixel  representing  its  four  neighboring  fine- 
grid  pixels,  which  have  the  same  magnitude  in  image-3, 
and  the  interpolation  does  not  violate  any  edge  or  bound¬ 
ary  of  this  particular  image.  Therefore,  some  high  fre¬ 
quency  components  are  picked  up  from  this  multigrid  pro¬ 
cedure  for  image-3. 

We  now  summarize  the  results  in  Fig.  2.  For  zero  or 
average  initial  condition  (see  Fig.  2): 

1)  The  Landweber  iteration  is  usually  the  slowest  of  the 
four  methods; 

2)  The  MLEM  iteration  is  usually  slower  than  ART,  but 
faster  than  the  Landweber  iteration.  However,  in 
some cases, such as image-1 and image-3 (see Fig. 2(a)
and  2(c)),  the  MLEM  iterations  with  average  initial 
condition  (in  Fig.  2(a))  and  with  average  or  multigrid 
initial  condition  (in  Fig.  2(c))  converge  faster  than  the 
generalized  Landweber  and  ART  iterations. 

To understand the convergence behavior in
Fig. 2, the three images in Fig. 1 are decomposed into
components on the singular vectors $v_i$ of $A_1$, shown in
Fig. 3(a)-(c). There are 144 singular vectors, ordered by
the magnitudes of the singular values from right to left
on the abscissa, with index 144 for the largest singular
value (lowest frequency) and index 1 for the smallest
singular value (highest frequency). If the image has high-frequency
components, i.e., large nonzero components
on the left, the iteration tends to take longer to converge.
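
The decomposition, and the reason small singular values converge slowly, can be sketched for the basic Landweber iteration. The sketch assumes the standard form x <- x + tau * A^T (b - A x) with a fixed step size `tau`, for which the error component on singular vector i is multiplied by (1 - tau * sigma_i^2) at every step; the function names are hypothetical. NumPy returns singular values in descending order, so array index 0 corresponds to index 144 in Fig. 3.

```python
import numpy as np

def svd_coefficients(A, image):
    """Singular values of A and the coefficients of `image` on the
    right singular vectors of A."""
    _, s, Vt = np.linalg.svd(A, full_matrices=False)
    return s, Vt @ image

def landweber(A, b, x0, n_iter, tau):
    """Basic Landweber iteration with a fixed step size tau (assumed form)."""
    x = np.array(x0, dtype=float)
    for _ in range(n_iter):
        x = x + tau * (A.T @ (b - A @ x))
    return x

def landweber_error_factors(s, tau, k):
    """Per-component error reduction after k Landweber steps."""
    return (1.0 - tau * s**2) ** k
```

With tau chosen from the largest singular value, the factor for a small sigma_i stays close to 1, so components on the left of the abscissa in Fig. 3 need many iterations unless the initial condition already contains them.
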
For  the  multigrid  initialization  (see  Fig.  2): 

1)  In  Fig.  2  (c),  the  multigrid  iteration  accelerates  the 
convergence  rate  dramatically  for  image-3; 

2) In Fig. 2 (a) and (b), the multigrid method does not
show any significant improvement over single-grid algorithms
for image-1 or image-2.

Figure 2: Euclidean distance between reconstructed and actual images vs. iteration number for (a) image-1, (b)
image-2, and (c) image-3 for different algorithms. LAND(0): Landweber iteration with zero initial condition; G-LAND(A):
generalized Landweber iteration with average initial condition; etc.

With the aid of the decompositions of the multigrid and
average initial conditions for PET-1 in Fig. 3(d)-(i), we can
make the following observations:

1) Refer to the decompositions of the average initial conditions
in Fig. 3(g)-(i). Since the average initial conditions
contain only low-frequency components (on
the singular vectors with the largest singular values), and
these components may be recovered after the first one
or two (generalized) Landweber iterations, using the
average initial condition may not speed up the convergence
of the (generalized) Landweber iteration. For
ART, the iteration was not accelerated for these three
images because some of their pixels are zero or very close to
zero. It has been shown [7] that ART may be accelerated
by a factor of three by using the average initial
condition for a phantom in which most of the pixels are
positive;

2) Compare the decomposition of image-1 (see Fig. 3 (a))
with the decomposition of the multigrid initial condition
for image-1 (see Fig. 3 (d)). The multigrid initial
condition not only contains some low-frequency
components, but also introduces some high-frequency
"noisy" components which take many iterations
to correct. In Fig. 2(a), the multigrid Landweber
and multigrid MLEM implementations, after several
iterations, are slower than their counterparts with zero
or average initial condition, and the multigrid generalized
Landweber iteration is slower than its counterpart
with zero or average initial condition;

3) Compare the decomposition of image-2 (see Fig. 3 (b))
with its corresponding multigrid decomposition (see
Fig. 3 (e)). The multigrid initial condition does not
recover most of the components, and does not help to
accelerate the convergence rate for image-2. The reason
is as follows. Every coarse-grid pixel is
split into 4 equivalent parts to initialize the fine-grid
iteration. However, since image-2 does not have the
local smoothness property defined above, this fine-grid
initialization does not resemble the actual image;

4) Compare the decomposition of image-3 (see Fig. 3 (c))
with its corresponding multigrid decomposition (see
Fig. 3 (f)). The multigrid initial condition not only
contains some low-frequency components for image-3,
but also contains some high-frequency components.
Therefore, using the multigrid method can greatly accelerate
the convergence rate for image-3, which does
have the local smoothness property.

Figure 3: The decompositions of (a) image-1, (b) image-2, and (c) image-3 on the singular vectors $v_i$ of $A_1$; the decompositions
of the multigrid initial conditions for (d) image-1, (e) image-2, and (f) image-3; and the decompositions of the
average initial conditions for (g) image-1, (h) image-2, and (i) image-3.
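
The paper deliberately leaves "local smoothness" without a strict definition. One possible way to quantify it, offered only as an assumption for illustration, is the relative distance between an image and the image obtained by replacing every 2 x 2 block of fine pixels with the block average. A value near zero means that replicating coarse pixels reproduces the image well. The function name is hypothetical, and the image is assumed to have even dimensions.

```python
import numpy as np

def block_replication_residual(image):
    """Relative distance between an image and its 2 x 2 block-average
    replication (0 means each 2 x 2 block is constant)."""
    img = np.asarray(image, dtype=float)
    n_rows, n_cols = img.shape
    # Axes 1 and 3 index the position inside each 2 x 2 block.
    blocks = img.reshape(n_rows // 2, 2, n_cols // 2, 2)
    block_means = blocks.mean(axis=(1, 3))
    replicated = np.kron(block_means, np.ones((2, 2)))
    norm = np.linalg.norm(img)
    return np.linalg.norm(img - replicated) / norm if norm > 0 else 0.0
```

Under this measure an image built from constant 2 x 2 blocks scores 0, while a one-pixel checkerboard scores about 0.71.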


C.  Conclusions 

The decompositions in Fig. 3 show that the multigrid
method accelerates the convergence rate for high-frequency
components if the image has the local smoothness property,
as image-3 does. The multigrid method does not
improve the convergence rate when the image has no high-frequency
components, as in image-1, or does not have the
local smoothness property, as in image-2. Note that any
image  will  have  the  local  smoothness  property  if  enough 
pixels  are  used  to  describe  it;  conversely,  the  multigrid 
method  could  be  used  to  identify  images  described  by  too 
many  pixels. 

The  advantage  of  using  the  multigrid  method,  from  our 
simulation  results,  is  its  capability  of  recovering  high  fre¬ 
quency  components  more  quickly,  if  an  image  has  the  lo¬ 
cal  smoothness  property.  It  is  true  that  if  the  coarse-grid 
iteration  recovers  some  low  frequency  components  of  the 
image  for  the  subsequent  fine-grid  iteration,  then  the  fine- 
grid  iteration  will  perform  better  in  the  reconstruction  of 
the  low  frequency  components  in  its  first  several  iterations. 
However, the fine-grid iteration itself may be very capable
of recovering low-frequency components, and the benefit
of using the coarse-grid iteration to effectively recover low
frequencies in the multigrid method may soon disappear
after the first several fine-grid iterations. Then the single-grid
iteration will be as good as the multigrid iteration.

In some cases, the coarse-grid iteration may be unable
to recover high-frequency components, and it may even introduce
some incorrect high-frequency components. Then
the multigrid method should show no sign of being able to
accelerate the convergence rate after the first several iterations.
This may help to explain why the paths of convergence
for the multigrid Landweber and multigrid MLEM
iterations in Fig. 2 (a) start better than the single-grid implementations
in the first several iterations, and then become
worse after about the ninth iteration for Landweber
and the fourth for MLEM.

We refer to this as the "starting good but ending bad"
phenomenon, which is due to an improper initial condition
for the fine-grid iteration. This may explain a
result in [1], in which using the image from FBP to initialize
the conjugate gradient iteration did not accelerate
the convergence rate of the iteration (see Fig. 4 of [1]).
Despite the good starting point, the conjugate gradient
method showed this phenomenon after the 75th iteration.
This may be because the FBP initial condition might have
introduced some high-frequency "noisy" components into
the reconstructed image, which dominated the image in
the iteration after the low-frequency components had been
recovered in the first 75 iterations.

D. Comparison with Ranganath's Results

In [5] Ranganath et al. proposed a multigrid MLEM algorithm.
They claimed that "the low frequency components
of the solution are effectively approximated and recovered
on the coarse grids while the high frequency components are
recovered on the fine grids". The first part of the statement,
about one of the capabilities of the coarse-grid iteration, is
correct. However, the potential of the multigrid
method to recover high-frequency components in the
case of local smoothness, and to introduce incorrect high-frequency
components, was not mentioned. The second
part of the statement, about the role played by the fine-grid
iteration, is also correct. However, this role should
not be emphasized in using the multigrid method, for the
following three reasons:

1)  For  the  Landweber  and  generalized  Landweber  itera¬ 
tions,  the  low  frequency  components  recovered  by  the 
coarse-grid  iteration  do  not  remain  unaltered  during 
the  fine-grid  iteration  [6].  Hence  the  fine-grid  iter¬ 
ation  must  do  more  than  reconstruct  high-frequency 
components; 

2) The convergence rate of the fine-grid iteration in recovering
high-frequency components is not good unless
the local smoothness property also holds (as it
might in the example used in [5]);

3)  It  is  likely  that,  for  images  without  high  frequency 
components,  after  the  first  several  iterations  the 
single-grid  iteration  may  be  as  good  as  the  multigrid 
iteration,  or  even  better  if  some  incorrect  high  fre¬ 
quency  components  were  introduced  by  the  coarse- 
grid  iteration. 

In  interpreting  the  results  of  [5]  in  light  of  the  guidelines 
proposed  in  Subsection  III.C,  the  following  points  should 
be  noted: 

1)  The  original  image  was  represented  by  128  x  128  pixels, 
which  benefits  the  multigrid  method,  since  the  fine 
sampling  induces  the  local  smoothness  property; 

2) The number of iterations used was small: "four"
fine-grid iterations. This again shows the multigrid
method in its best light. Since it has received some
low-frequency components from the coarse-grid iteration,
the subsequent fine-grid iteration will seem to
have effectively recovered more low-frequency components
in its first several iterations than the single-grid
iteration. The fine-grid iteration may in fact not have
done much for the high-frequency components in these
first several iterations.

The  benefit  of  applying  the  MLEM  multigrid  approach 
on  the  Shepp-Logan  phantom  in  the  first  several  iterations 
can  also  be  seen  in  [6].  Thus  the  results  of  [5]  seem  to  be 
in  agreement  with  the  guidelines  for  the  multigrid  method. 

IV.  Summary 

A  numerical  study  of  multigrid  implementations  of  several 
iterative  image  reconstruction  algorithms  has  been  pre¬ 
sented.  The  multigrid  implementation  was  found  to  accel¬ 
erate  the  convergence  rate  of  high-frequency  components  of 
the  image  when  the  image  possessed  the  local  smoothness 
property.  In  other  cases  it  was  unhelpful,  and  may  even 
slow  down  the  convergence  rate.  Results  of  some  other 
papers  were  interpreted  in  light  of  the  conclusions  drawn 
here. 

APPENDIX I

This paper is a minor work that uses group theory to aid in designing oscillator
circuits. The application is quite unusual and novel.

On Upper Bounds of the Equivalent Oscillator
and Notch-Filter Circuits:
A Non-Commutative Group Theoretic Approach

P. Raadhakrishnan, Andrew E. Yagle, B. V. Rao, and
John E. Dorband

Abstract: In this paper we show that the problem of synthesizing
equivalent oscillators and notch-filters from the knowledge of a given
parent circuit is identical to that of finding the isomorphisms of a
non-commutative group. It is also shown that many of the results
established by earlier methods can easily be explained using the theory
developed in this paper. The results derived in this paper show that
earlier results do not lead to the best possible upper bound. The
corrected upper bound is derived by counting the members of the
non-commutative group. A well-known family of circuits is used to
illustrate the theory developed.

I.  Introduction 

In  the  recent  past  there  have  been  attempts  to  group  RC 
oscillators  and  notch-filters  so  that  additional  equivalent  circuits 
can  be  generated  from  the  knowledge  of  a  given  parent  circuit 
[7]-[14]. Among these efforts, [7]-[12] are applicable only to
oscillator circuits. A unified synthesis framework for oscillators
and notch-filters alike was presented in [14]. However, the
results reported in [15] indicated that the upper bound is greater
than the existing bounds.

In  this  paper  a  synthesis  method  based  on  non-commutative 
group  theory  that  extends  the  earlier  reported  results  on  upper 
bounds is presented. Also demonstrated is the fact that the
elements of the non-commutative structure are a natural extension
of the results reported earlier by other investigators. The
results  reported  in  this  paper  have  the  advantage  of  synthesizing 
a  stable  circuit  from  an  unstable  parent  circuit. 

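Let $S$ be a set of $n$ elements $x_1, x_2, \ldots, x_n$. A 1:1 mapping
$\Phi$ acting on $S$ is represented as

$$\Phi = \begin{pmatrix} x_1 & x_2 & \cdots & x_n \\ \Phi(x_1) & \Phi(x_2) & \cdots & \Phi(x_n) \end{pmatrix} \qquad (1)$$

where $\Phi(x_k)$ is the image of $x_k$ under $\Phi$ and every $x_k \in S$. A
mapping $\Phi$ is called decomposable if it can be expressed as a
composition of two or more maps. For example, $\Phi = \Phi_1 \Phi_2$
means that the action of $\Phi$ is equivalent to the action of $\Phi_1$
followed by the action of $\Phi_2$.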

II.  Theoretical  Basis  of  the  Synthesis  Problem 

2.1. Derivation of the Network Aspects

A popular generalized canonical RC oscillator (notch-filter)
structure is shown in Fig. 1. The operational amplifier (OA)
employed in this circuit is assumed to be ideal with infinite-gain
mode, and $\beta$ is assumed to be a passive RC network. $V_i$ and $V_0$
are the input and the output voltages of the circuit. Since the OA is
assumed to be ideal with $\infty$ gain $A$, $\infty$ input impedance, and zero
output impedance, from Fig. 1 it follows that $A(V_3 - V_4) = V_0$.
Since $A$ is infinite and $V_0$ is finite, $V_3 - V_4 = 0$. Hence, the
voltages at terminals three and four should be equal. As a
consequence, any network connection between terminals three
and four is redundant. The condition that the input impedance
is $\infty$ forces the currents $I_3 = I_4 = 0$. Let $y_{ij}$ denote the admittance
connected between the nodes $i$ and $j$ of the given passive
network.

Fig. 1. Generalized canonical RC network under study.

1) Oscillator Mode: When the operating mode is that of
self-sustained oscillations, $V_i = V_0$. With the assumptions made
earlier, the following identity results [15]:


2) Notch-Filter Mode: For the notch-filter mode, at the transmission
null frequency $\omega_0$, the output voltage $V_0$ is zero for a
nonzero finite input voltage $V_i$. This condition, along with the
additional observations made earlier, results in


Let $A$ be a set defined as $A: \{\theta_1, \theta_2, \gamma_1, \gamma_2\}$, where the elements
of $A$ satisfy the constraint

$$\theta_1 * \theta_2 - \gamma_1 * \gamma_2 = 0. \qquad (4)$$

Set $A$ can be used to denote the equations (2) or (3) due to their
structural similarity. Depending on whether the circuit is an
oscillator or a notch-filter, $A$ will take values from the set
$\{y_{31}, y_{44}, y_{41}, y_{33}\}$ or the set $\{y_{32}, y_{44}, y_{42}, y_{33}\}$, respectively. This observation
leads to a unified approach to the synthesis problem as
described in the next section.

III. Relationship Between Group Isomorphisms of A
and Synthesis of Equivalent Circuits

Consider set $A: \{\theta_1, \theta_2, \gamma_1, \gamma_2\}$, where the elements of $A$ satisfy
constraint (4), which is called the reduced identity in the context
of circuit theory [14]. Since this identity is common to all the
oscillators (notch-filters) that can be represented by the two-port
structure in Fig. 1, any isomorphism of $A$ that leaves (4) unchanged
will lead to an additional oscillator (notch-filter) circuit.
Hence, the set $S$ of all isomorphisms of $A$ that leave constraint (4)
unchanged will enable one to find the upper bounds on the
additional non-trivial equivalent circuits. In order to illustrate
this point, let $\Phi_i$ be an element of set $S$. The action of $\Phi_i$ on $A$ will
lead to $\Phi_i(A): \{\Phi_i(\theta_1), \Phi_i(\theta_2), \Phi_i(\gamma_1), \Phi_i(\gamma_2)\}$. Constraint (4) is
changed to the form

$$\Phi_i(\theta_1) * \Phi_i(\theta_2) - \Phi_i(\gamma_1) * \Phi_i(\gamma_2) = 0. \qquad (5)$$


Since  A  has  only  four  elements,  the  following  are  all  the  possible 
invariant  isomorphisms: 


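$$\ldots, \quad \begin{pmatrix} \theta_1 & \theta_2 & \gamma_1 & \gamma_2 \\ \gamma_2 & \gamma_1 & \theta_2 & \theta_1 \end{pmatrix}, \quad \begin{pmatrix} \theta_1 & \theta_2 & \gamma_1 & \gamma_2 \\ \gamma_1 & \gamma_2 & \theta_2 & \theta_1 \end{pmatrix}, \quad \begin{pmatrix} \theta_1 & \theta_2 & \gamma_1 & \gamma_2 \\ \gamma_2 & \gamma_1 & \theta_1 & \theta_2 \end{pmatrix}$$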


The following interesting properties are observed from the eight
isomorphisms derived:

$\Phi_1 = e$ = identity mapping.

Since the elements of $S$ are self inverses, any additional combination
of the maps can be reduced to one of the above given
forms. Hence, $S$ is closed under the composition operation. A
careful analysis shows that a composition of any two elements
from different subsets $S_1$, $S_2$, and $S_3$ is noncommutative. The
composition operation plays the role of the binary operation for
the group $S$. A number of nontrivial new solutions to condition
(4) can now be computed using the group theoretic approach in
more than one way. Given that the elements of $S$ are the
restricted isomorphisms of $A$, the number of nontrivial additional
solutions is equal to [(cardinality of $S$) - 1] (the [-1]
being the removal of the identity element from $S$). Hence, the
number of additional new solutions is equal to seven.
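
The count of eight can be checked by brute force. The sketch below enumerates all relabelings of the four symbols and keeps those that leave the constraint theta1 * theta2 - gamma1 * gamma2 = 0 intact. It assumes that "invariant" means the expression is mapped either to itself or to its negative, since both leave the equation unchanged; the symbol names and function names are hypothetical.

```python
from itertools import permutations

SYMBOLS = ("theta1", "theta2", "gamma1", "gamma2")

def constraint_terms(order):
    """The two products of the constraint, as unordered pairs of symbols."""
    t1, t2, g1, g2 = order
    return frozenset((t1, t2)), frozenset((g1, g2))

def invariant_permutations():
    """All relabelings of SYMBOLS that keep the constraint unchanged."""
    base = constraint_terms(SYMBOLS)
    found = []
    for image in permutations(SYMBOLS):
        pos, neg = constraint_terms(image)
        # Either both products are preserved, or they are exchanged,
        # which only flips the overall sign of the left-hand side.
        if (pos, neg) == base or (neg, pos) == base:
            found.append(image)
    return found

def compose(p, q):
    """Relabeling obtained by applying p first and then q."""
    q_map = dict(zip(SYMBOLS, q))
    return tuple(q_map[s] for s in p)
```

Under this assumption the enumeration returns eight relabelings including the identity, so seven are nontrivial, and composing them in different orders can give different results.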

The  elements  of  A  take  values  from  the  two-port  admittances 
that  satisfy  (2)  for  the  oscillator  mode  (or  (3)  for  the  notch-filter). 
Therefore,  computing  all  possible  invariant  isomorphisms  of  A  is 
equivalent  to  finding  all  the  possible  different  two-port  admit¬ 
tance  transformations  that  leave  the  condition  for  oscillations 
unaffected. Hence, the number of additional equivalent circuits
is equal to [(cardinality of $S$) - 1].

IV. Circuit Theoretic Interpretation of the
Properties of the Non-Commutative Group

1) $\Phi_1$ is an Identity Mapping: After having chosen set $A$:
$\{\theta_1, \theta_2, \gamma_1, \gamma_2\}$ from the sets $\{y_{31}, y_{44}, y_{41}, y_{33}\}$ or
$\{y_{32}, y_{44}, y_{42}, y_{33}\}$, applying $\Phi_1$ to $A$ leads to the set $\Phi_1(A)$:
$\{\Phi_1(\theta_1), \Phi_1(\theta_2), \Phi_1(\gamma_1), \Phi_1(\gamma_2)\}$. However, since $\Phi_1$ is the identity
mapping ($A = \Phi_1(A)$), condition (4) is unaffected by the
action of $\Phi_1$. This means that when there is no rearrangement
of the parent circuit, the original circuit will remain unaffected
and there will be no new equivalent circuit.


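$$\Phi_i * \Phi_i = \Phi_i^2 = e, \quad i \in \{1, 2, 3, \ldots, 8\}$$

$$\Phi_2 * \Phi_3 = \Phi_3 * \Phi_2 = \Phi_4; \quad \Phi_5 * \Phi_6 = \Phi_6 * \Phi_5 = \Phi_4; \quad \Phi_7 * \Phi_8 = \Phi_8 * \Phi_7 = \Phi_4,$$

$$\Phi_3 * \Phi_4 = \Phi_4 * \Phi_3 = \Phi_2; \quad \Phi_6 * \Phi_4 = \Phi_4 * \Phi_6 = \Phi_5; \quad \Phi_8 * \Phi_4 = \Phi_4 * \Phi_8 = \Phi_7,$$

$$\Phi_4 * \Phi_2 = \Phi_2 * \Phi_4 = \Phi_3; \quad \Phi_4 * \Phi_5 = \Phi_5 * \Phi_4 = \Phi_6; \quad \Phi_4 * \Phi_7 = \Phi_7 * \Phi_4 = \Phi_8.$$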


It is easily verified that sets $S_1$, $S_2$, and $S_3$, defined as $S_1$:
$\{\Phi_1, \Phi_2, \Phi_3, \Phi_4\}$, $S_2$: $\{\Phi_1, \Phi_4, \Phi_5, \Phi_6\}$, and $S_3$: $\{\Phi_1, \Phi_4, \Phi_7, \Phi_8\}$,
satisfy all the conditions of the Klein group presented in [14].
Another interesting property is that all the isomorphisms are
self inverses. Hence, repeated application of any isomorphism
on $A$ will be equivalent to applying the identity mapping or the
original mapping itself, depending on whether the number of
times the mapping is applied is even or odd, respectively. From a
circuit theory viewpoint, this implies that the repeated application
of a particular admittance transformation will not lead to
more than one additional equivalent circuit. The same conclusion
can be made by considering the two-element group, i.e., the
identity element and any other given isomorphism from set $S$.

In  order  to  construct  the  necessary  non-commutative  group, 
we  have  to  note  the  following  additional  properties  exhibited  by 
the  elements  of  set  S: 

4)j  *  4)^  =  4)^  *  4)2  =  $2  *  ‘I’s  =  ‘I’s  *  ^3  =  ‘t’s 

4)3  *  4)j  =  4)j  *  4),  =  4)2  *  $6  =  $6  *  ‘1*3  =  ‘*’7 

$5  *  $7  =  $7  *  =  $6  *  $8  =  $g  *  $5  =  $2 

$7  *  =  $8  *  $7  =  $8  *  4*8  =  $5  *  $8  =  $3 

$7  *  $-  =  $3  *  $7  =  <P-i  *  $8  “  'J’s  *  ~  ^5 

4)^  *  4)7  =  $7  *  $3  =  $8  »  $2  =  *  ‘t’s  =  't’s- 


2) Every Element of Set $S$ is a Self Inverse: When $i \neq 1$, $\Phi_i$, for
$i = 2, 3, \ldots, 8$, will be a nontrivial isomorphism from $(S, *)$, where
$*$ is the composition operation. The action of $\Phi_i$ on $A$ leads to a
reordering of the elements of $A$ and leaves condition (4) fixed.
Since $A$ contains the two-port admittances of the parent circuit,
any reordering of $A$ leads to a reordering of the admittances
and results in an additional equivalent circuit. This means that
given a parent oscillator circuit, the action of a nontrivial $\Phi_i$
readily synthesizes an additional new oscillator (notch-filter)
circuit with the same operating frequency. However, if $\Phi_i$ is
applied to the new set $\Phi_i(A)$, the resulting set is $\Phi_i * (\Phi_i(A))$,
which can be rewritten as $\Phi_i^2(A) = \Phi_1(A) = A$. Clearly, applying
$\Phi_i$ again will result in the set $\Phi_i(A)$. This leads to the
conclusion that a given nontrivial isomorphism leads to a unique
equivalent circuit. Hence, the seven nontrivial isomorphisms
lead to seven nontrivial equivalent circuits. This is not sufficient
to conclude that the upper bound on the additional equivalent
oscillator or notch-filter circuits is seven. The next observation
leads to the sufficient part of the upper bound.

3) Set $S$ Forms a Non-Commutative Finite Group: It was noted
that the elements of set $S$ form a non-commutative group. Set $S$
can be written as the union of subsets $S_1$, $S_2$, and $S_3$, which form
three different Klein groups. Hence, the total number of elements
of the group $S$ can be easily computed using
set-theoretic concepts as

$$|S_1| + |S_2| + |S_3| - |S_1 \cap S_2| - |S_1 \cap S_3| - |S_2 \cap S_3| + |S_1 \cap S_2 \cap S_3| = 4 + 4 + 4 - 2 - 2 - 2 + 1 = 7.$$

This is much easier than going through an exhaustive search of
all the possible combinations of the elements of $S$. It can be
verified that there are 120 different ways in which these elements
can be combined. Since some of them are non-commutative
mappings, it is possible to have infinitely many different
compositions which are equivalent to one of the eight elements
of $S$. Hence, the number of nontrivial equivalent circuits is
seven.

4)  Subset  Sj  is  Complement  to  the  Ser  5,  U  5j  in  the  Sense  of 
Stability:  It  can  be  shown  that  if  the  parent  circuit  is  a  stable 
circuit,  subgroup  5,  will  generate  all  stable  circuits  and  sub¬ 
groups  Sj  and  5,  will  generate  some  unstable  circuits.  Since  5 
has  eight  elements  and  the  subgroup  5,  has  four  elements,  the 
number  of  stable  circuits  from  a  stable  parent  circuit  will  be 
four.  This  result  holds  for  an  oscillator  and  a  notch-filter  alike.  A 
restricted  version  of  this  result  for  the  case  of  a  stable  parent 
oscillator  is  available  in  [7].  Hence,  even  for  a  stable  parent 
circuit,  the  results  derived  here  are  more  general.  However,  the 
results  gain  additional  strength  from  the  proof  that  the  present 
method  can  produce  the  identical  number  of  stable  circuits 
(notch-filters  and  oscillators)  starting  from  an  unstable  parent 
circuit. This result is new and provides a tight upper bound on
the  stable  circuits  that  can  be  synthesized  from  a  stable  or  an 
unstable  parent  circuit.  In  the  restricted  case  of  oscillators,  given 
an  unstable  parent  circuit,  results  in  [7]  will  produce  only  unsta¬ 
ble  circuits.  Hence  the  result  in  [7]  needs  an  additional  condition 
that  the  parent  circuit  should  be  stable.  Stability  is  not  an  issue 
in  our  procedure,  as  explained  above.  The  next  subsection  shows 
how  the  non-commutatrve  group  results  generalize  the  earlier 
reported  results. 

V.  Interpretation  of  the  Earlier  Results  From 
Non-Commutative  Framework 

Most  of  the  earlier  results  are  applicable  to  either  an  oscilla¬ 
tor  or  a  notch-filter.  Since  the  non-commutative  result  is  applica¬ 
ble  to  both  cases  it  is  of  interest  to  find  out  how  well  the 
established  synthesis  techniques  relate  to  the  present  theory. 
Decomposition of set $S$ into the commutative subgroups $S_1$, $S_2$,
and $S_3$ becomes useful in this discussion.

In [7] it was shown that given a parent circuit employing an
ideal OA, it is possible to generate three additional equivalent
oscillators with the same operating frequency. The key assumption
not explicitly mentioned in [7] is that the parent oscillator
circuit needs to be stable. Hence, this method forces the inventor
to find at least one stable circuit by trial and error.
Given an unstable circuit, the method in [7] will generate only
unstable circuits because it preserves the transfer function form.
It can be shown that the method in [7] is identical to using
subgroup $S_1$ for a given stable parent circuit. Since the details
would unduly extend the length of the paper, we omit the proofs;
they will be presented elsewhere.

Results in [8] can be obtained by choosing only $S_1$ and $S_2$ to
construct additional equivalent circuits from the knowledge of a
parent circuit. The upper bound on the additional equivalent
oscillators was shown to be five in [8], and the nontrivial
isomorphisms for sets $S_1$ and $S_2$ taken together were shown to be five
in our earlier paper [14]. Since the results presented in [14] form
a subset of the results in this paper, we conclude that taking any
two subgroups from $S_1$, $S_2$, and $S_3$ leads to the upper bounds
derived in [8]. The group isomorphic diagrams are presented in
Fig. 2.

VI.  Illustrative  Example 

In  this  section  we  illustrate  the  theory  developed  in  this  paper 
and  the  claim  that  the  non-commutative  group  method  will 
enable  one  to  synthesize  the  stable  as  well  as  unstable  circuits 
for  a  given  parent  circuit.  Since  there  are  well-established  results 
available  for  the  case  of  oscillators  we  illustrate  the  theory  for 
oscillators. 

Assume that we are given the circuit in Fig. 3(a). In terms of
the two-port admittances, the necessary constraint equation can
be derived as

$$y_1 y_3 - y_2 y_4 = 0.$$

At  this  stage  there  are  two  choices  for  the  admittances,  as 
given  below: 

1) 

5C4  1  1 

2) 

SC,  ■  ‘  l.  v,  •;v:;  ,J 

“  SC,R,  +  1’^^  “  “  i?3  and  y,  -  -  +  5C4. 

Fig. 3. Unstable Wien-Bridge oscillator family.

Fig. 4. Stable Wien-Bridge oscillator family.

Though both of these choices lead to the necessary conditions,
only one choice leads to a stable circuit. In general, stability
cannot be predetermined. Circuits have to be generated, and
then the stable circuits are chosen using simulation methods.
Let us consider the first choice of parameters. The corresponding
circuit is shown in Fig. 3(a). Letting $\delta_1 = y_1$, $\delta_2 = y_3$,
$\gamma_1 = y_2$, and $\gamma_2 = y_4$, we can construct the necessary
non-commutative group $S$. Elements of $S$ are defined as follows:


^2  Tl 
^2  Tl 


yi 

72 


Hence,  applying  the  results  in  [7]  to  the  circuit  in  Fig.  4(a)  will 
lead  to  the  remaining  circuits  in  Fig.  4. 

A  vast  number  of  oscillator  circuits  are  reported  in  [8].  These 
circuits  can  be  generated  and  grouped  using  the  theory  pre¬ 
sented  here. 


■>'2  1  ^2  Tl  72  \ 

^'/’^•l^2  72  r.j 


<1)5 : 


^2 

72 


7i 


72  ]  ^2  7i  72] 

^2  I  ^  (  72  7l  ^2  ^1  j 


1^1  ^2  7i  ■>'2  ] .  (t,  .  /  ^2  7l  72  \ 

\7l  72  &2  ^1  /’  \72  7i  ^1  ^2  / 


Furthermore,  sets  5,,  S2,  and  Sj  defined  as 


5,: 


{<!>,,  <h2,<t)3,<h4},  5.:  {<Di,<D„<t.5,<Dg},  and  Sy. 
satisfy  all  the  conditions  of  a  Klein  group  [14], 

Action  of  (hj  on  the  circuit  shown  in  Fig.  3(a)  leads  to  the 
circuit  shown  in  Fig.  3(b).  Action  of  on  the  circuits  in  Figs, 
3(a)  and  (b)  leads  to  the  circuits  shown  in  Figs.  3(c)  and  (d), 
respectively. It can be verified that the elements of subgroup $S_1$
do not generate any additional circuits. We now show that the
results  in  [7]  will  generate  identical  circuits.  Applying  theorem  1 
in  [7]  to  the  circuit  in  Fig.  3(a),  we  can  generate  the  circuit  in 
Fig.  3(b).  Applying  theorem  2  in  [7]  to  circuits  in  Figs.  3(a)  and 
(b)  leads  to  the  circuits  in  Figs.  3(c)  and  (d),  respectively.  Since 
the  method  in  [7]  preserves  the  transfer  function,  it  cannot 
generate  any  additional  circuits. 


However, the non-commutative group theoretic method can
generate additional circuits using subgroups $S_2$ and $S_3$. Action
of $\Phi_5$, $\Phi_6$, $\Phi_7$, and $\Phi_8$ on the circuit in Fig. 3(a) generates the
circuits in Figs. 4(a), (b), (c), and (d), respectively. The circuits in
Fig. 4 are all stable members of the well-known Wien Bridge
oscillator family. We also note that applying subgroup $S_1$ to the
circuit in Fig. 4(a) leads to the remaining circuits in Fig. 4.
APPENDIX J1

B.  Sahiner  and  A.E.  Yagle,  “Image  Reconstruction  from  Projections  Under 
Wavelet  Constraints,”  to  appear  in  IEEE  Trans.  Sig.  Proc.  41(12),  December 
1993  (special  issue  on  wavelets). 

This  paper  considers  the  problem  of  image  reconstruction  from  projections,  given 
constraints  not  on  the  image,  but  on  certain  wavelet  coefficients  of  the  image.  The  idea 
is  that  low-resolution  regions  of  the  image  can  be  locally  low-pass  filtered  by  setting  high- 
resolution  wavelet  coefficients  to  zero.  These  are  then  used  as  constraints  on  the  image 
reconstruction  process,  so  that  other  areas  of  the  reconstructed  image  are  improved  as 
well. The constraints are implemented as a simple filter directly on the image.
[Figure: peak SNR versus Huffman coding rate (bpp). The initial image was BARBARA.]
I. Introduction

In many problems arising in fields such as medical imaging,
non-destructive testing, radio astronomy, and geophysics [1], one needs
to reconstruct a two-dimensional object or image from its projections,
which amounts to computing the inverse Radon transform.
A problem with the inverse Radon transform is that the ramp filter
amplifies the high-frequency components of both the noise and the
data. Since noise usually dominates at high frequencies, it is common
practice to use a low-pass filter in conjunction with the ramp
filter to improve the signal-to-noise ratio (SNR). However, the SNR
improvement obtained by using a low-pass filter comes at the expense
of degraded image resolution, since high-resolution features
in the image will also be smoothed. It is desirable to reduce the
noise energy in the reconstructed image over regions where
high-resolution features are not present, by using spatially-varying
filtering.

In this note, we use wavelets to perform this desired localized
low-pass filtering. We show how thresholding can be used to determine
the regions in the wavelet domain where wavelet coefficients
may be set to zero, effecting spatially-varying filtering, and
provide a statistical justification for it. Alternatively, a priori
information about the image can be used to identify such regions.
We then use these zero wavelet coefficients as constraints, and
compute the minimum mean-squared error image which satisfies
these constraints.


II. The Radon and Wavelet Transforms

A. Image Reconstruction From Projections

The inverse Radon transform problem is to reconstruct an image
$\mu(x, y)$ from its projections $p(r, \theta)$, where

$$p(r, \theta) = \mathcal{R}\{\mu(x, y)\} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \mu(x, y)\,\delta(r - x\cos\theta - y\sin\theta)\,dx\,dy \qquad (1)$$

is the Radon transform of $\mu(x, y)$. A common procedure for obtaining
$\mu(x, y)$ from $p(r, \theta)$ is filtered backprojection (FBP), in
which the projections are first filtered to yield $s(r, \theta) = p(r, \theta) * q(r)$,
where the Fourier transform of the ramp filter $q(r)$ approximates
$|\omega|$. The image $\mu(x, y)$ is then obtained by backprojecting
$s(r, \theta)$. In practical problems, we have only samples of $p(r, \theta)$, in
which case samples of the image $\mu(i, j)$ are obtained by a
discretization of the continuous FBP method

$$\mu(i, j) = \mathcal{R}_d^{-1}\{p(m, n)\} = \frac{1}{2N}\sum_{n=0}^{N-1}\ \sum_{m=-\infty}^{\infty}\ \sum_{l=-\infty}^{\infty} q(l - m)\,p(m, n)\,h(i\cos n\Delta + j\sin n\Delta - l) \qquad (2)$$

where $N$ is the total number of views, $\Delta = \pi/N$, $p(m, n)$ is the
$m$th projection in the $n$th view, $h(x)$ is an interpolation function
(e.g., for linear interpolation $h(x) = 1 - |x|$ for $|x| \le 1$ and
$h(x) = 0$ for $|x| > 1$), and $\mathcal{R}_d^{-1}$ is the discrete version of the
inverse Radon transform operator.
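The discretized FBP sum (2) can be sketched directly in Python. This is a minimal illustration under stated assumptions, not the implementation used for the examples in this note: the ramp filter is taken to be the band-limited sequence whose discrete-time Fourier transform equals $|\omega|$ on $[-\pi, \pi]$, the sample spacing is one pixel, and the names `ramp`, `interp`, `fbp_pixel`, and `disk_projection` are hypothetical helpers introduced here.

```python
import math

def ramp(k):
    """Assumed ramp filter q(k): DTFT equals |w| on [-pi, pi]."""
    if k == 0:
        return math.pi / 2
    if k % 2 == 0:
        return 0.0
    return -2.0 / (math.pi * k * k)

def interp(x):
    """Linear interpolation kernel h(x) = 1 - |x| for |x| <= 1, else 0."""
    return max(0.0, 1.0 - abs(x))

def fbp_pixel(p, half, i, j):
    """Evaluate the triple sum in (2) at pixel (i, j).

    p[n][m + half] holds sample m (m = -half..half) of view n.
    """
    n_views = len(p)
    delta = math.pi / n_views
    total = 0.0
    for n in range(n_views):
        t = i * math.cos(n * delta) + j * math.sin(n * delta)
        base = math.floor(t)
        for l in (base, base + 1):      # only two taps of h are nonzero
            weight = interp(t - l)
            if weight == 0.0:
                continue
            # s(l, n) = sum_m q(l - m) p(m, n): ramp-filtered projection
            s = sum(ramp(l - m) * p[n][m + half]
                    for m in range(-half, half + 1))
            total += weight * s
    return total / (2 * n_views)

def disk_projection(m, radius):
    """Line integral through a centered unit-valued disk at offset m."""
    return 2.0 * math.sqrt(radius * radius - m * m) if abs(m) < radius else 0.0

half, n_views, radius = 32, 16, 10
p = [[disk_projection(m, radius) for m in range(-half, half + 1)]
     for _ in range(n_views)]
center = fbp_pixel(p, half, 0, 0)   # should be close to the disk value 1
print(round(center, 3))
```

With these assumed conventions the prefactor $1/(2N)$ in (2) is what makes the reconstructed value at the center of a unit-valued disk come out close to 1.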

B.  The  Wavelet  Transform 

The wavelet transform can be viewed as a time-frequency
representation that has good localizing properties. In this
correspondence, we restrict attention to orthogonal, separable 2-D wavelet
transforms [2]. We assume that the reader is familiar with
wavelet transform theory and we only establish the notation to be
used.

The 2-D wavelet transform $W^i_{2^j} f$ of a 2-D, square-integrable
function $f(x, y)$ is defined as

$$W^i_{2^j} f(n, m) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(z_1, z_2)\,\psi^i_{2^j}(2^j n - z_1,\ 2^j m - z_2)\,dz_1\,dz_2 \qquad (3)$$

where $\psi^i_{2^j}(x, y) = 2^{-j}\psi^i(2^{-j}x, 2^{-j}y)$, and $\psi^i(x, y)$, $1 \le i \le 3$, are
the orthonormal subwavelets with different orientations in the 2-D
plane for different $i$.

For a discrete 2-D function $f_0(n, m)$, the wavelet transform is
defined as follows. First, the blurred signal at scale $j$ is defined
recursively as

$$f_j(n, m) = \sum_{k_1}\sum_{k_2} h(2n - k_1)\,h(2m - k_2)\,f_{j-1}(k_1, k_2) \qquad (4)$$

and then the wavelet transform at scale $j$ is defined as

$$W^i_{2^j} f(n, m) = \sum_{k_1}\sum_{k_2} g^i(2n - k_1,\ 2m - k_2)\,f_{j-1}(k_1, k_2), \qquad i = 1, 2, 3 \qquad (5)$$

where $g^1(n, m) = g(n)h(m)$, $g^2(n, m) = g(m)h(n)$, and $g^3(n, m)
= g(n)g(m)$. For a detailed description of properties satisfied by
the wavelets $\psi^i(x, y)$ and the filters $g(n)$ and $h(n)$, see [2], [3].
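As an illustration of the recursions (4) and (5), the following Python sketch computes one level of a separable 2-D transform. It assumes the Haar filters and a periodic extension of the image at the borders; both are choices made here for the example and are not prescribed by the definitions above. The function names are hypothetical.

```python
import math

R2 = 1.0 / math.sqrt(2.0)
H = {0: R2, 1: R2}     # assumed low-pass filter h(n): Haar
G = {0: R2, 1: -R2}    # assumed high-pass filter g(n): Haar

def analyze(f, row_filter, col_filter):
    """sum_{k1,k2} row_filter(2n-k1) col_filter(2m-k2) f(k1,k2), periodic in k."""
    rows, cols = len(f), len(f[0])
    out = [[0.0] * (cols // 2) for _ in range(rows // 2)]
    for n in range(rows // 2):
        for m in range(cols // 2):
            acc = 0.0
            for a, wa in row_filter.items():        # a = 2n - k1
                for b, wb in col_filter.items():    # b = 2m - k2
                    acc += wa * wb * f[(2 * n - a) % rows][(2 * m - b) % cols]
            out[n][m] = acc
    return out

def haar_level(f_prev):
    """Return (f_j, W^1, W^2, W^3) computed from f_{j-1}."""
    blurred = analyze(f_prev, H, H)   # (4)
    w1 = analyze(f_prev, G, H)        # g^1(n, m) = g(n) h(m)
    w2 = analyze(f_prev, H, G)        # g^2(n, m) = g(m) h(n)
    w3 = analyze(f_prev, G, G)        # g^3(n, m) = g(n) g(m)
    return blurred, w1, w2, w3

def energy(a):
    return sum(v * v for row in a for v in row)

image = [[float((3 * r + 5 * c) % 7) for c in range(4)] for r in range(4)]
bands = haar_level(image)
flat_bands = haar_level([[1.0] * 4 for _ in range(4)])
```

Because the assumed filters are orthonormal, the four bands together carry the same energy as the input, and a constant image produces zero detail coefficients in all three subwavelets.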


III.  Multiscale  Filtering 

The idea of filtering a signal in time-frequency space has been
discussed and employed for the Wigner distribution and short-time
Fourier transform in [4]-[6]. In this note, we regard the wavelet
transform as a time-frequency representation and perform a simple
version of multiscale filtering by setting the wavelet coefficients of
the signal to zero in regions of the wavelet transform domain where
the signal energy is known or estimated to be much smaller than
the noise energy. Since fine-scale wavelet transform components
represent localized high-resolution features of the image, windowing
these to zero effectively smoothes the image, much as low-pass
filtering does. The advantage of using wavelets is that this can be
done on a localized basis, smoothing in some areas while leaving
other areas (such as edges) unaffected.

In image reconstruction, there is often considerable a priori
knowledge about the image, and the region in the wavelet domain
where we set the wavelet coefficients to zero can be determined
from this knowledge. For example, it may be known, a priori, that
there are no high-resolution features in some region $D$ of the image.
$D$ may be known to represent a flat or slowly-varying part of the
image, or $D$ may be known to be free of edges. Since fine-scale
wavelet transform components represent localized high-resolution
features of the image, we can window these to zero in the region
$D$, and cancel some of the noise energy without degrading the
image.
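The localized smoothing described above can be illustrated in one dimension. The sketch below assumes a single level of the orthonormal Haar transform and a hand-picked region; the signal values and the helper names are hypothetical and serve only to show that zeroing fine-scale coefficients inside the region averages out the noise there while leaving an edge outside the region untouched.

```python
import math

R2 = 1.0 / math.sqrt(2.0)

def haar_forward(x):
    """One level of the orthonormal 1-D Haar transform (even-length input)."""
    approx = [(x[2 * k] + x[2 * k + 1]) * R2 for k in range(len(x) // 2)]
    detail = [(x[2 * k] - x[2 * k + 1]) * R2 for k in range(len(x) // 2)]
    return approx, detail

def haar_inverse(approx, detail):
    x = []
    for a, d in zip(approx, detail):
        x.extend([(a + d) * R2, (a - d) * R2])
    return x

def smooth_in_region(x, region):
    """Zero the finest-scale coefficients whose index lies in `region`."""
    approx, detail = haar_forward(x)
    detail = [0.0 if k in region else d for k, d in enumerate(detail)]
    return haar_inverse(approx, detail)

# Noisy flat segment (samples 0-5) followed by an edge (samples 6-7).
signal = [1.1, 0.9, 1.05, 0.95, 0.9, 1.1, 1.0, 5.0]
D = {0, 1, 2}                 # coefficient indices covering the flat part
result = smooth_in_region(signal, D)
```

Inside the region each pair of samples is replaced by its mean, while the last pair, which contains the edge, is reproduced exactly.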

If  there  is  no  a  priori  knowledge  of  image  features,  a  threshold¬ 
ing  approach  may  be  used  to  set  some  wavelet  coefficients  to  zero. 
The  idea  is  to  eliminate  noise  where  it  is  possible  to  do  so  without 
significantly  degrading  the  image.  Since  the  wavelet  transform  is 
being  used,  this  noise  filtering  can  be  done  on  a  localized  basis. 

The  ideas  of  using  thresholding  on  a  time-frequency  represen¬ 
tation  and  setting  certain  regions  of  a  time-frequency  representa¬ 
tion  to  zero  for  time-varying  filtering  have  been  mentioned  and  used 
in  [5]-[9].  Here,  we  supply  a  statistical  justification  for  the  thresh¬ 
olding  approach  as  we  use  it  in  this  correspondence. 

A.  Detection  Problem  Formulation 

Assume that $f_0(x, y)$ is a zero-mean white Gaussian random field
with power spectral density $\sigma_0^2$, and let $\{\psi^i_{2^j}(2^j n - x,\ 2^j m - y)$,
$i = 1, 2, 3$, $(j, n, m) \in Z^3\}$ be an orthonormal wavelet family. Let
$W^i_{2^j} f_0(n, m)$ be the discrete wavelet transform of $f_0(x, y)$, defined
using (3). Then, by the orthonormality of the wavelet family, the
quadruply indexed random sequence $W^i_{2^j} f_0(n, m)$ is uncorrelated
and zero-mean Gaussian, with variance $\sigma_0^2$. To obtain from
$f_0(x, y)$ a random field $f(x, y)$ whose wavelet coefficients are zero with
probability one outside a region $D_1$, we define

$$f(x, y) = \sum_{(i, j, n, m) \in D_1} W^i_{2^j} f_0(n, m)\,\psi^i_{2^j}(2^j n - x,\ 2^j m - y). \qquad (6)$$

We now state the problem. Given the noisy observations

$$f_\eta(x, y) = f(x, y) + \eta(x, y) \qquad (7)$$

of $f(x, y)$, where $\eta(x, y)$ is a zero-mean white Gaussian noise field
with power spectral density $\sigma_\eta^2$, determine the region $D_1$.

B.  Detection  Problem  Solution 

The wavelet transform of $f_\eta(x, y)$ is

$$W^i_{2^j} f_\eta(n, m) = \begin{cases} W^i_{2^j} f(n, m) + W^i_{2^j}\eta(n, m) & \text{if } (i, j, n, m) \in D_1 \\ W^i_{2^j}\eta(n, m) & \text{otherwise,} \end{cases} \qquad (8)$$

where $W^i_{2^j}\eta(n, m)$ is the wavelet transform of $\eta(x, y)$. Note that
$W^i_{2^j} f_\eta(n, m)$ is a zero-mean uncorrelated Gaussian random sequence
whose variance is $\sigma_0^2 + \sigma_\eta^2$ for $(i, j, n, m)$ inside $D_1$ and
$\sigma_\eta^2$ for $(i, j, n, m)$ outside $D_1$. Therefore, the decision of whether a
point $(i, j, n, m) \in D_1$ decouples from similar decisions for other
points. The solution of the problem is the test [10]

$$|W^i_{2^j} f_\eta(n, m)| \ \underset{\notin D_1}{\overset{\in D_1}{\gtrless}}\ \beta \qquad (9)$$

where $\beta$ is a threshold, determined using, e.g., a Neyman-Pearson
criterion. This means that we can decide whether $(i, j, n, m)$ is in
$D_1$ simply by thresholding the absolute value of the wavelet
transform of the noisy image at that point. When the whiteness or
orthonormality assumptions are relaxed, the threshold test described
above is no longer guaranteed to be optimal. However, the threshold
test still seems to be the "natural" approach.
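A minimal sketch of how the threshold in (9) might be set under a Neyman-Pearson criterion is given below. It assumes the white Gaussian model of this section, so that a coefficient outside $D_1$ is distributed as $N(0, \sigma_\eta^2)$, and it fixes the probability of wrongly declaring such a coefficient to be inside $D_1$. The function names and the numerical values are hypothetical.

```python
from statistics import NormalDist

def np_threshold(sigma_noise, false_alarm):
    """Threshold beta with P(|W| > beta) = false_alarm for W ~ N(0, sigma_noise^2)."""
    return sigma_noise * NormalDist().inv_cdf(1.0 - false_alarm / 2.0)

def in_region(coefficient, beta):
    """Threshold test (9): decide 'inside D_1' when |W| exceeds beta."""
    return abs(coefficient) > beta

beta = np_threshold(sigma_noise=0.1, false_alarm=0.05)
decisions = [in_region(w, beta) for w in (0.02, -0.35, 0.15, 0.5)]
print(round(beta, 3), decisions)
```

Coefficients declared outside $D_1$ are the ones that would then be set to zero and used as constraints in the reconstruction.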

IV.  Image  Reconstruction  under  Wavelet  Constraints 

For  clarity  of  presentation,  we  first  define  and  solve  the  problem 
of  image  reconstruction  under  wavelet  constraints  at  a  single  scale 
and  on  a  single  subwavelet.  We  then  generalize  to  multiple  scales 
and  several  subwavelets. 

A.  Constrained  Wavelet  Coefficients  on  a  Single  Scale 

Suppose that we are given noisy observations $p_\eta(m, n) = p(m, n)
+ \eta(m, n)$ of the projections $p(m, n)$, where $\eta(m, n)$ is a zero-mean
Gaussian noise field which is uncorrelated in the angular
variable and correlated in the radial variable with autocorrelation
$R_\eta(k) = E[\eta(m, n)\,\eta(m + k, n)]$. Suppose also that the wavelet
coefficients of the actual image $\mu_a(i, j)$ at translations
$\{(i_c, j_c),\ 1 \le c \le C\}$ on the finest scale $2^1$, and with respect to the
first subwavelet $\psi^1(x, y)$, are known or estimated (see Section III) to be zero:

$$W^1_{2^1}\mu_a(i_c, j_c) = \sum_i \sum_j \mu_a(i, j)\,g^1(2i_c - i,\ 2j_c - j) = 0, \qquad 1 \le c \le C. \qquad (10)$$

Our goal is to compute the image $\mu(i, j)$ such that

1) $\mu(i, j)$ satisfies the wavelet constraints; and

2) $E\{\sum_i \sum_j (\mu_a(i, j) - \mu(i, j))^2\}$ is minimized.

Let $\mu_\eta(i, j) = \mathcal{R}_d^{-1}\{p_\eta(m, n)\}$, and $\epsilon(i, j) = \mathcal{R}_d^{-1}\{\eta(m, n)\}$, so
that $\mu_\eta(i, j) = \mu_a(i, j) + \epsilon(i, j)$. To solve the problem, we first
estimate $\epsilon(i, j)$ using the given constraints and then we subtract the
noise estimate from the noisy image $\mu_\eta(i, j)$. Since $\eta(m, n)$ is
zero-mean Gaussian, $\epsilon(i, j)$ is also zero-mean Gaussian, and the
solution is the linear minimum mean square error (LMMSE) estimate.
The constraints (10) can be written as $\{W^1_{2^1}\epsilon(i_c, j_c) =
W^1_{2^1}\mu_\eta(i_c, j_c),\ 1 \le c \le C\}$. The goal is to compute the MMSE
estimate of $\epsilon(i, j)$ subject to these constraints.

Let $R_\epsilon(i, j)$ be the autocovariance of $\epsilon(i, j)$, let
$R_{W^1_{2^1}\epsilon}(2(i - k),\ 2(j - l)) = E[W^1_{2^1}\epsilon(i, j)\,W^1_{2^1}\epsilon(k, l)]$, and let
$R_{(W^1_{2^1}\epsilon),(\epsilon)}(2i - k,\ 2j - l) = E[W^1_{2^1}\epsilon(i, j)\,\epsilon(k, l)]$.
Using a discrete variable extension of the results of [11], it can
easily be shown that

$$R_\epsilon(i, j) = \mathcal{R}_d^{-1}\{q(m) * R_\eta(m)\}. \qquad (11)$$

The functions $R_{W^1_{2^1}\epsilon}(2(i - k),\ 2(j - l))$ and
$R_{(W^1_{2^1}\epsilon),(\epsilon)}(2i - k,\ 2j - l)$ are computed as

$$R_{W^1_{2^1}\epsilon}(2(i - k),\ 2(j - l)) = \sum_m \sum_n \sum_s \sum_t R_\epsilon(m, n)\,g^1(s, t)\,g^1(s + 2(i - k) - m,\ t + 2(j - l) - n)$$

$$R_{(W^1_{2^1}\epsilon),(\epsilon)}(2i - k,\ 2j - l) = \sum_m \sum_n R_\epsilon(m, n)\,g^1(2i - k - m,\ 2j - l - n). \qquad (12)$$

The LMMSE estimate of $\epsilon(i, j)$ is

$$\hat{\epsilon}(i, j) = \sum_{c=1}^{C} a(c)\,R_{(W^1_{2^1}\epsilon),(\epsilon)}(2i_c - i,\ 2j_c - j) \qquad (13)$$

where the coefficients $\{a(c),\ 1 \le c \le C\}$ are computed by solving
the $C \times C$ linear system of equations

$$M\,[a(1), \ldots, a(C)]^T = [W^1_{2^1}\mu_\eta(i_1, j_1), \ldots, W^1_{2^1}\mu_\eta(i_C, j_C)]^T \qquad (14)$$

and the $(u, v)$th entry of the $C \times C$ matrix $M$ is
$R_{W^1_{2^1}\epsilon}(2(i_u - i_v),\ 2(j_u - j_v))$.

Finally, the constrained reconstructed image is $\mu(i, j) = \mu_\eta(i, j) - \hat{\epsilon}(i, j)$.

B. Constrained Wavelet Coefficients on Multiple Scales

We now generalize the results of the previous subsection to multiple
scales and more than one subwavelet. For notational simplicity,
we assume that two subwavelets are used; generalization to
more than two subwavelets should be apparent.

Assume that the wavelet transform of $\mu_a(i, j)$ is known to be
zero at various points on $L$ different scales. We index each of these
points by an integer $c_1(l)$ or $c_2(l)$, where $l$ denotes the scale and
the subscript denotes the subwavelet used. Similarly to (10), these
constraints can be written as

$$W^1_{2^l}\epsilon(i_{c_1(l)}, j_{c_1(l)}) = W^1_{2^l}\mu_\eta(i_{c_1(l)}, j_{c_1(l)}), \qquad 1 \le c_1(l) \le C_1(l),\ 1 \le l \le L$$

$$W^2_{2^l}\epsilon(i_{c_2(l)}, j_{c_2(l)}) = W^2_{2^l}\mu_\eta(i_{c_2(l)}, j_{c_2(l)}), \qquad 1 \le c_2(l) \le C_2(l),\ 1 \le l \le L \qquad (15)$$

where the superscripts denote the subwavelet being used.

Having established this notation, the arguments used in Section
IV-A lead to the formula

$$\hat{\epsilon}(i, j) = \sum_{l=1}^{L}\ \sum_{c_1(l)=1}^{C_1(l)} a(c_1(l))\,R_{(W^1_{2^l}\epsilon),(\epsilon)}(2^l i_{c_1(l)} - i,\ 2^l j_{c_1(l)} - j) + \sum_{l=1}^{L}\ \sum_{c_2(l)=1}^{C_2(l)} a(c_2(l))\,R_{(W^2_{2^l}\epsilon),(\epsilon)}(2^l i_{c_2(l)} - i,\ 2^l j_{c_2(l)} - j) \qquad (16)$$

where the coefficients $a(\cdot)$ are again computed by solving a linear
system of equations. The right-hand side of the system is a vector of
the known values of $W^1_{2^l}\mu_\eta(i_{c_1(l)}, j_{c_1(l)})$ and
$W^2_{2^l}\mu_\eta(i_{c_2(l)}, j_{c_2(l)})$, which are again taken at
the wavelet coefficients known to be zero in the noiseless image.
The system matrix $M$ consists of $L^2$ submatrices $M_{l,m}$, each of which
contains the cross-covariance between wavelet coefficients of the
noise at scales $l$ and $m$. $M$ and $M_{l,m}$ have the forms

$$M = \begin{bmatrix} M_{1,1} & M_{1,2} & \cdots & M_{1,L} \\ M_{2,1} & M_{2,2} & \cdots & M_{2,L} \\ \vdots & \vdots & & \vdots \\ M_{L,1} & M_{L,2} & \cdots & M_{L,L} \end{bmatrix}, \qquad M_{l,m} = \begin{bmatrix} M_{l,m}(1, 1) & M_{l,m}(1, 2) \\ M_{l,m}(2, 1) & M_{l,m}(2, 2) \end{bmatrix} \qquad (17)$$

where $M_{l,m}(1, 1)$ contains the cross-covariance of noise wavelet
coefficients computed with respect to the first subwavelet. The
$(c_1(l), c_1(m))$th entry of $M_{l,m}(1, 1)$ is
$R_{(W^1_{2^l}\epsilon),(W^1_{2^m}\epsilon)}(2^l i_{c_1(l)} - 2^m i_{c_1(m)},\ 2^l j_{c_1(l)} - 2^m j_{c_1(m)})$.
Entries of $M_{l,m}(1, 2)$, $M_{l,m}(2, 1)$, and $M_{l,m}(2, 2)$ are defined
similarly; the cross-covariances between subwavelets 1 and 2 are
used for $M_{l,m}(1, 2)$, and cross-covariances computed with respect
to the second subwavelet are used for $M_{l,m}(2, 2)$. Again, the
reconstructed image is $\mu(i, j) = \mu_\eta(i, j) - \hat{\epsilon}(i, j)$.
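The structure of the constrained estimate, a linear system for the coefficients followed by subtraction of the noise estimate, can be sketched in matrix form on a small 1-D signal. Everything specific in this example is an assumption made for illustration: the noise autocovariance $\rho^{|i-j|}$ stands in for the autocovariance of the reconstructed noise, each row of `G` plays the role of one constrained Haar wavelet coefficient, and the helper names are hypothetical.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[r]] for r, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def matvec(m, v):
    return [sum(mi * vi for mi, vi in zip(row, v)) for row in m]

n_pix, rho = 6, 0.5
# Assumed nonwhite noise autocovariance: R[i][j] = rho^|i - j|.
R = [[rho ** abs(i - j) for j in range(n_pix)] for i in range(n_pix)]
r2 = 1.0 / math.sqrt(2.0)
# Each row of G computes one constrained wavelet coefficient.
G = [[r2, -r2, 0, 0, 0, 0],
     [0, 0, r2, -r2, 0, 0]]

mu_noisy = [1.2, 0.8, 1.1, 0.9, 1.0, 1.0]
y = matvec(G, mu_noisy)          # coefficients of the noisy image (right-hand side)
# cross[i][c] = E[eps_i (W eps)_c], the cross-covariance in the estimate
cross = [matvec(G, R[i]) for i in range(n_pix)]
# M[u][v] = E[(W eps)_u (W eps)_v], the system matrix
M = [[sum(G[u][i] * cross[i][v] for i in range(n_pix))
      for v in range(len(G))] for u in range(len(G))]
a = solve(M, y)
eps_hat = [sum(a[c] * cross[i][c] for c in range(len(G)))
           for i in range(n_pix)]
mu = [m - e for m, e in zip(mu_noisy, eps_hat)]
```

In this toy setting the result satisfies the constraints exactly, and because the assumed noise is correlated, the noise estimate is also nonzero at samples 4 and 5, which carry no constraint.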
Fig. 1. The reconstruction of the disk used in Example 1 from noiseless
projections.


Fig.  3.  The  noisy  image  for  Example  1,  obtained  from  the  noisy  projec¬ 
tions  by  FBP. 




We  present  four  numerical  examples  which  illustrate  the  above 
procedures. 

Example  1:  The  noiseless  image  used  in  this  example  is  a  disk 
of  value  1.00  and  radius  0.81  in  a  background  of  value  0.  Projec¬ 
tions  of  the  disk  are  computed  over  128  angles  and  128  lines  in 
each  angle.  The  reconstruction  from  noiseless  projections  is  shown 
in Fig. 1. The noise added to the projections is obtained by passing
zero-mean  white  Gaussian  noise  with  variance  0.01  through  a  filter 
whose  discrete-time  Fourier  transform  is  (sin  (w))^^.  The  autoco¬ 
variance  of  the  noise  is  shown  in  Fig.  2,  and  the  100  x  100  image 
obtained  from  the  noisy  projections  using  FBP  is  shown  in  Fig.  3. 

The wavelets we use are two subwavelets of the Haar basis,
which can be regarded as difference operators in the x and y
directions (the third Haar subwavelet, which can be regarded as a
difference operator in the diagonal direction, is not used). We
constrain the two finest-scale wavelet coefficients to be zero in a
15 x 55 rectangular area $R_0$ inside the disk. We use (16) to estimate the
noise $\epsilon(i, j)$, where the matrix $M$ is given by (17).

The MMSE image is shown in Fig. 4. The first row in Table I
gives the average performance of our procedure for 10 different
noise realizations for this example. The area obtained by enlarging
$R_0$ by 20 pixels in every direction is denoted as $R_1$; this is roughly
the area in which we expect improvement, due to the support of
the filter. The whole image is denoted by $R_t$. From Table I, we see
that the noise in $R_0$ is almost completely eliminated. We also find
that, compared to the unprocessed noisy image, the noise powers
in the regions $R_1 - R_0$ and $R_t - R_0$, in which we do not have any
wavelet constraints, are reduced by 13.1 and 6.4%, respectively.
This shows that constraining wavelet coefficients in a given region
improves the reconstruction in other regions. This is because the
noise $\epsilon$ in the reconstructed image is nonwhite, due to the fact that
the Radon transform is nonunitary and the additive noise $\eta$ in the
projections is nonwhite.

TABLE I

                    Noise Power in           Improvement (%)
               R_0      R_1      R_t      R_1 - R_0    R_t - R_0
Noisy Image    5.20     28.48    53.35
Example 1      0.082    20.30    45.15    13.15        6.40
Example 2      2.475    24.48    49.78    (illegible)  (illegible)

Fig. 4. The MMSE image obtained by constraining the
wavelet coefficients in $R_0$ to 0.
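The improvement figures quoted for Example 1 can be reproduced from the noise powers in Table I. The sketch below assumes that the improvement is the percentage reduction of the noise power in the region outside $R_0$ (that is, in $R_1 - R_0$ or $R_t - R_0$) relative to the unprocessed noisy image; this definition is inferred here because it matches the tabulated Example 1 values, and the names are hypothetical.

```python
noise_power = {
    "noisy":     {"R0": 5.20,  "R1": 28.48, "Rt": 53.35},
    "example 1": {"R0": 0.082, "R1": 20.30, "Rt": 45.15},
}

def improvement(before, after, outer):
    """Percent reduction of the noise power in the region `outer` minus R0."""
    base = before[outer] - before["R0"]
    return 100.0 * (base - (after[outer] - after["R0"])) / base

imp_r1 = improvement(noise_power["noisy"], noise_power["example 1"], "R1")
imp_rt = improvement(noise_power["noisy"], noise_power["example 1"], "Rt")
print(f"{imp_r1:.2f} {imp_rt:.2f}")  # 13.15 6.40
```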

Example 2: In the second example, the original image is not flat
in $R_0$, but varies smoothly in that region. The noiseless image,
which is the union of a disk and an exponential, is shown in Fig.
5. The additive noise is the same as in Example 1. The noisy
reconstruction is shown in Fig. 6. The wavelet constraints used are
the same as in Example 1.
The resulting image is shown in Fig. 7. From Table I, we see
that the noise in $R_0$ is not completely eliminated. This is mostly
because we are using the Haar wavelet, which is not smooth and
has only one vanishing moment. The noiseless image has small,
but not negligible, high-resolution components, which results in
nonzero fine-scale wavelet coefficients when the Haar basis is used.
This suggests using a different wavelet, which we now do.

Example 3: In the third example, we use the same image and
noise as in Example 2, but we use two scales of the six-coefficient
Daubechies wavelet [3], which has three vanishing moments. The
resulting MMSE image is shown in Fig. 8. The noise power inside
$R_0$, which was reduced to 2.5 in Example 2, is now further reduced
to 0.77 with this choice of a smoother wavelet.

Example 4: In this example, we use the Shepp-Logan head
phantom [12] as our noiseless image, which is shown in Fig. 9.
The autocorrelation of the noise is the same as in the previous
examples, and the noise variance is $4 \times 10^{-4}$. The 128 x 128 noisy image
is shown in Fig. 10. We use the Haar wavelet, and constrain the
two finest-scale wavelet coefficients to be zero over a region $D$ of
the image. The region $D$ is obtained by the thresholding approach
over a region $D'$ in the center of the image. First, the second-finest
wavelet coefficients inside $D'$ are set to zero whenever their absolute
value is below a threshold. The region which will be affected
by the above operation is called $D''$. Then, inside $D''$, another
threshold is used to set the finest-scale coefficients to zero. The
resulting MMSE image is shown in Fig. 11. The noise power has
been reduced by 20.3% in the whole image, while still preserving
the edges.

Fig. 5. The reconstruction of the image used in Examples 2 and 3 from
noiseless projections.

Fig. 6. The noisy image for Examples 2 and 3, obtained from the noisy
projections by FBP.

Fig. 9. The noiseless Shepp-Logan phantom.

Fig. 10. The noisy image for Example 4, obtained from the noisy
projections of the Shepp-Logan phantom.

Fig. 11. MMSE image obtained by constraining the wavelet coefficients
of the noisy image; Example 4.


References

[1] S. R. Deans, The Radon Transform and Some of Its Applications.
New York: Wiley, 1983.

[2] S. Mallat, "A theory for multiresolution signal decomposition: The
wavelet representation," IEEE Trans. Patt. Anal. Machine Intell.,
vol. PAMI-11, pp. 674-693, 1989.

[3] I. Daubechies, "Orthonormal bases of compactly supported wavelets,"
IEEE Trans. Inform. Theory, vol. 36, pp. 961-1005, 1990.

[4] B. E. A. Saleh and N. S. Subotic, "Time-variant filtering of signals
in the mixed time-frequency domain," IEEE Trans. Acoust., Speech,
Signal Processing, vol. ASSP-33, pp. 1479-1487, 1985.

[5] T. E. Koczwara and D. L. Jones, "On mask selection for time-varying
filtering using the Wigner distribution," in Proc. ICASSP,
Albuquerque, NM, 1990, pp. 2487-2490.

[6] M. Bikdash and K. B. Yu, "Linear shift varying filtering of
nonstationary chirp signals," in Proc. ICASSP, New York, 1988, pp. 428-432.

[7] B. Boashash and L. B. White, "Instantaneous frequency estimation
and automatic time-varying filtering," in Proc. ICASSP,
Albuquerque, NM, 1990, pp. 1221-1224.

[8] J. Jeong and W. J. Williams, "Time-varying filtering and signal
synthesis," in Time-Frequency Signal Analysis, B. Boashash, Ed.
Melbourne: Longman and Cheshire, 1991.

[9] G. F. Boudreaux-Bartels and T. W. Parks, "Time-varying filtering
and signal estimation using Wigner distribution synthesis
techniques," IEEE Trans. Acoust., Speech, Signal Processing, vol. 34,
pp. 442-451, 1988.

[10] H. L. Van Trees, Detection, Estimation, and Modulation Theory.
New York: Wiley, 1968.

[11] A. K. Jain and S. Ansari, "Radon transform theory for random fields
and optimum image reconstruction from noisy projections," in Proc.
ICASSP, San Diego, CA, 1984, pp. 12A.7.1-12A.7.4.

[12] L. A. Shepp and B. F. Logan, "The Fourier reconstruction of a head
section," IEEE Trans. Nucl. Sci., vol. NS-21, pp. 21-42, 1974.

Multiresolution Representations Using the Autocorrelation
Functions of Compactly Supported Wavelets

Naoki Saito and Gregory Beylkin

Abstract: We propose a shift-invariant multiresolution representation
of signals or images using dilations and translations of the
autocorrelation functions of compactly supported wavelets. Although these
functions do not form an orthonormal basis, their properties make them
useful for signal and image analysis. Unlike wavelet-based
orthonormal representations, our representation has 1) symmetric
analyzing functions, 2) shift-invariance, 3) associated iterative
interpolation schemes, and 4) a simple algorithm for finding the
locations of the multiscale edges as zero-crossings.

We also develop a noniterative method for reconstructing signals
from their zero-crossings (and slopes at these zero-crossings) in our
representation. This method reduces the reconstruction problem to that
of solving a system of linear algebraic equations.
ON THE USE OF WAVELETS IN INVERTING THE RADON

Abstract

In this paper, we first present some new results on constrained
image reconstruction. Given constraints on pixel
values and the statistics of additive noise in the projections,
we show how to compute the minimum mean-square
estimate of the reconstructed image. Second, we present
some results on the use of the wavelet transform to perform
spatially-varying filtering of the image and show how
noise can be suppressed in flat areas of the image. Third,
we combine the previous results into a new constrained image
reconstruction procedure, in which image constraints
are applied in the wavelet domain. The new procedure improves
the reconstructed image not only in locations where
wavelet constraints are applied, but also in other regions.

I.  Introduction 

In  many  problems  arising  in  fields  such  as  medical  imaging, 
non-destructive  testing,  and  geophysics  [2],  one  needs  to  re¬ 
construct  a  two-dimensional  object  or  image  from  its  pro¬ 
jections,  which  amounts  to  computing  the  inverse  Radon 
transform.  The  inverse  Radon  transform  is  ill-conditioned; 
if  it  is  regularized  by  using  a  low-pass  filter,  then  high- 
resolution  features  in  the  image  are  smoothed  or  lost.  It  is 
desirable  to  reduce  the  noise  energy  in  the  reconstructed 
image  over  regions  where  high-resolution  features  are  not 
present,  by  using  spatially-varying  filtering. 

In  this  paper,  we  use  wavelets  to  perform  this  desired 
spatially-varying filtering. We constrain fine-scale wavelet
coefficients to zero in certain regions in the wavelet domain,
effecting  localized  low-pass  filtering.  To  develop  the  nec¬ 
essary  algorithm,  we  first  discuss  the  problem  of  comput¬ 
ing  the  minimum  mean-square  error  estimate  of  the  recon¬ 
structed  image  which  satisfies  given  constraints  on  some 
pixel  values.  This  leads  us  to  an  algorithm  which  operates 
directly  on  the  reconstructed  image.  We  then  modify  the 
results  of  this  algorithm  and  apply  it  to  the  more  realistic 
problem  of  constraining  the  wavelet  coefficients. 

II.  Constrained  Image  Reconstruction 
from  Projections 

*This work was supported by the Office of Naval Research under
grant #N00014-90-J-1897.

Basic Problem: The Radon transform inversion problem is the basic
problem of image reconstruction from projections. Let $\mu(x,y)$ and
$p(r,\theta)$ denote an image and its Radon transform (projections),
respectively. The most common procedure for obtaining $\mu(x,y)$ from
$p(r,\theta)$ is filtered backprojection (FBP), in which the
projections are first filtered to yield
$s(r,\theta) = p(r,\theta) * q(r)$, where the Fourier transform of the
filter $q(r)$ approximates $|w|$. The image $\mu(x,y)$ is then
obtained by backprojecting $s(r,\theta)$. In practical problems, we
have only samples of $p(r,\theta)$, in which case samples of the image
$\mu(i,j)$ are obtained by a discretization of the continuous FBP
method:

$$\mu(i,j) = \mathcal{R}_d^{-1}\{p(m,n)\} = \frac{1}{2N}\sum_{n=0}^{N-1}\sum_{m=-\infty}^{\infty}\sum_{l=-\infty}^{\infty} q(l-m)\,p(m,n)\,h(i\cos n\Delta + j\sin n\Delta - l) \qquad (1)$$

where $N$ is the total number of views, $\Delta = \pi/N$, $p(m,n)$ is
the $m$th projection in the $n$th view, $h(x)$ is an interpolation
function and $\mathcal{R}_d^{-1}$ is the discrete version of the
inverse Radon transform operator. Note that $q(\cdot)$ may be chosen
to filter noise in the projections; however, it will also smooth
the entire reconstructed image.
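
The discretized FBP described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the filter $q$ is taken to be the band-limited ramp (Ram-Lak) kernel with unit sample spacing, the interpolation function $h$ is linear interpolation, and the names `ramp_kernel` and `fbp` are hypothetical. The overall scale factor depends on how $q$ is normalized; with this kernel (whose transform is $|f|$ in cycles per sample) the sum over views is weighted by $\Delta$.

```python
import numpy as np

def ramp_kernel(half_width):
    """Band-limited ramp (Ram-Lak) kernel q(l), l = -half_width..half_width.

    Assumption: unit sample spacing, so q(0) = 1/4, q(l) = -1/(pi*l)^2 for
    odd l, and q(l) = 0 for even nonzero l.
    """
    l = np.arange(-half_width, half_width + 1)
    q = np.zeros(l.size)
    q[l == 0] = 0.25
    odd = (l % 2) == 1
    q[odd] = -1.0 / (np.pi * l[odd]) ** 2
    return q

def fbp(p, size):
    """Discrete filtered backprojection.

    p[m, n] is sample m of view n (view angles n * pi / N). Returns a
    size x size image whose centre coincides with the centre of each view.
    """
    n_samples, n_views = p.shape
    delta = np.pi / n_views
    q = ramp_kernel(n_samples)
    centre = (n_samples - 1) / 2.0
    coords = np.arange(size) - (size - 1) / 2.0
    i, j = np.meshgrid(coords, coords, indexing="ij")
    image = np.zeros((size, size))
    for n in range(n_views):
        # Filtering: s(l, n) = sum_m q(l - m) p(m, n), kept for l = 0..n_samples-1.
        s = np.convolve(p[:, n], q, mode="full")[n_samples:2 * n_samples]
        # Backprojection; linear interpolation plays the role of h(.).
        r = i * np.cos(n * delta) + j * np.sin(n * delta) + centre
        image += np.interp(r, np.arange(n_samples), s, left=0.0, right=0.0)
    return image * delta
```

For a unit-density disk of radius 30 samples, whose projections are $2\sqrt{30^2 - r^2}$ in every view, the reconstructed value at the centre should be close to one and the value well outside the disk close to zero.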

Constrained Minimum Mean Square Estimate: Suppose that we are given
noisy observations $p_n(m,n) = p(m,n) + \eta(m,n)$ of the projections
$p(m,n)$, where $\eta(m,n)$ is a zero-mean Gaussian noise field,
uncorrelated between views (i.e., in the angular variable), but
correlated within a view with autocorrelation
$R_\eta(m) = E[\eta(k,n)\,\eta(m+k,n)]$. Suppose that we are also
given the value of the actual image $\mu_a(i_c,j_c)$ at $C$ points.
The problem addressed in this section is to find the image
$\hat\mu(i,j)$ such that:

1. $\hat\mu(i,j)$ satisfies the constrained image values
$\{\mu_a(i_c,j_c) = K_c,\ 1 \le c \le C\}$, and

2. $E[(\hat\mu(i,j) - \mu_a(i,j))^2]$ is minimized.

To solve this problem, we first compute the noisy image $\mu_n(i,j)$
from the noisy projections $p_n(m,n)$. Second, we determine the noise
values at the constrained pixels $(i_c,j_c)$ by subtracting the actual
pixel values $K_c$ from the noisy image pixel values
$\mu_n(i_c,j_c)$. Third, we estimate the noise values at other pixels
from the known noise values at the constrained pixels. Finally, we
subtract the noise estimates from the noisy image.

Specifically, let $\epsilon(i,j) = \mathcal{R}_d^{-1}\{\eta(m,n)\}$.
Then, the problem is to compute the minimum mean-square estimate
$\hat\epsilon(i,j)$ given the known values
$\{\epsilon(i_c,j_c) = \mu_n(i_c,j_c) - K_c,\ 1 \le c \le C\}$. Since
$\eta(m,n)$ is zero-mean Gaussian, $\epsilon(i,j)$ is also zero-mean
Gaussian, and the solution is the linear minimum mean square error
(LMMSE) estimate.
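
The four-step procedure above can be sketched as follows. This is a minimal illustration, assuming the autocovariance of the image-domain noise is available as a callable `autocov(di, dj)` and using the standard LMMSE interpolation form (a weighted sum of shifted autocovariances whose weights solve a $C \times C$ linear system). The function names and the Gaussian-shaped autocovariance in the usage note are hypothetical.

```python
import numpy as np

def lmmse_noise_estimate(noisy, constraints, autocov):
    """Estimate the noise field from its known values at constrained pixels.

    noisy       : 2-D noisy image
    constraints : list of ((i_c, j_c), K_c), K_c being the true pixel value
    autocov     : callable autocov(di, dj) giving the noise autocovariance
                  at lag (di, dj); must accept integer arrays (assumption)
    """
    pts = np.array([pt for pt, _ in constraints])
    # Known noise values: noisy pixel minus actual pixel value.
    known = np.array([noisy[i, j] - K for (i, j), K in constraints], dtype=float)
    # Matrix of autocovariances between all pairs of constrained pixels.
    di = pts[:, 0][:, None] - pts[:, 0][None, :]
    dj = pts[:, 1][:, None] - pts[:, 1][None, :]
    beta = np.linalg.solve(autocov(di, dj), known)
    # Noise estimate everywhere: weighted sum of shifted autocovariances.
    ii, jj = np.indices(noisy.shape)
    return sum(b * autocov(ii - i_c, jj - j_c) for b, (i_c, j_c) in zip(beta, pts))

def constrained_image(noisy, constraints, autocov):
    """Subtract the LMMSE noise estimate from the noisy image."""
    return noisy - lmmse_noise_estimate(noisy, constraints, autocov)
```

By construction the result reproduces the prescribed values exactly at the constrained pixels, and it also changes neighbouring pixels wherever the autocovariance is nonzero. For instance, with `autocov = lambda di, dj: np.exp(-(di**2 + dj**2) / 8.0)` each constraint influences a neighbourhood a few pixels wide.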

To compute the LMMSE, we first need the autocovariance of the noise
$\epsilon(i,j)$. It can be shown [1] that the autocovariance of the
noise in the reconstructed image is

$$R_\epsilon(i,j) = E\{\epsilon(i',j')\,\epsilon(i+i',j+j')\} = \mathcal{R}_d^{-1}\{q(m) * \cdots\} \qquad (2)$$



The LMMSE $\hat\epsilon(i,j)$ of $\epsilon(i,j)$ is found as [1]

$$\hat\epsilon(i,j) = \sum_{c=1}^{C} \beta_c\, R_\epsilon(i - i_c,\, j - j_c), \qquad (3)$$

where the $\beta_c$ are computed by solving the $C \times C$ linear
system of equations

$$M\,[\beta_1, \ldots, \beta_C]^T = [\mu_n(i_1,j_1) - K_1, \ldots, \mu_n(i_C,j_C) - K_C]^T \qquad (4)$$

and the $[u,v]$ element of the matrix $M$ is
$R_\epsilon(i_u - i_v,\, j_u - j_v)$. The final LMMSE $\hat\mu(i,j)$
of the image is then $\hat\mu(i,j) = \mu_n(i,j) - \hat\epsilon(i,j)$.

Note that when the additive noise on the projections is white,
$R_\epsilon(i,j)$ is equal to [...] and each pixel constraint affects
$\hat\epsilon(i,j)$ not only at the constrained pixels
$(i_1,j_1), \ldots, (i_C,j_C)$ but also in the vicinity of each
point. The size of this vicinity is the size of support of
$R_\epsilon(i,j)$. Hence constraining some pixel values also affects
other pixel values. Depending on the autocovariance of $\eta(m,n)$,
the support of $R_\epsilon(i,j)$ may become very significant.


III. The Wavelet Transform


The wavelet transform is a time-frequency tool that has good
localizing properties. The 2-D continuous wavelet transform
$W_{2^j} f(x,y)$ of a 2-D, square-integrable function $f(x,y)$ is
defined as the convolution of $f(x,y)$ with dilations of a wavelet
basis function $\psi(x,y)$ at different scales:

$$W_{2^j} f(x,y) = f(x,y) ** \psi_{2^j}(x,y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(z_1,z_2)\,\psi_{2^j}(x - z_1,\, y - z_2)\,dz_1\,dz_2, \qquad (5)$$

where $\psi_{2^j}(x,y) = 2^{-j}\psi(2^{-j}x, 2^{-j}y)$, and
$\psi(x,y)$ is the wavelet basis function.

The wavelet transform of an image can be computed directly from its
Radon transform. Let $W_{2^j}\mu(x,y)$ be the wavelet transform of an
image $\mu(x,y)$ and let $\Psi(r,\theta) = \mathcal{R}\{\psi(x,y)\}$
denote the Radon transform of the wavelet basis function
$\psi(x,y)$. Using the scaling and convolution properties of the
Radon transform [2], we obtain

$$W_{2^j}\mu(x,y) = \mu(x,y) ** \psi_{2^j}(x,y) = \mathcal{R}^{-1}\{p(r,\theta) * \Psi(2^{-j}r, \theta)\}. \qquad (6)$$

Thus the wavelet transform of the reconstructed image can be found
by first computing a wavelet-like transform of each projection, and
then taking the inverse Radon transform.

For image processing applications, several sub-wavelets are used,
each having a specific orientation [3]. For example, for edge
detection applications, two wavelet basis functions $\psi^1(x,y)$ and
$\psi^2(x,y)$ are used, and two sets of wavelet coefficients are
computed at each scale:

$$W^1_{2^j} f(x,y) = f(x,y) ** \psi^1_{2^j}(x,y),$$
$$W^2_{2^j} f(x,y) = f(x,y) ** \psi^2_{2^j}(x,y). \qquad (7)$$

The continuous wavelet transform (5) is redundant and can be sampled
to yield the discrete wavelet transform, defined as

$$W^d_{2^j} f(n,m) = [\ldots] \qquad (8)$$


IV. Multiscale Filtering

The idea of filtering a signal in time-frequency space has
been discussed and employed for the Wigner distribution
and short-time Fourier transform in [4] and [5]. In this
paper,  we  perform  a  simple  version  of  multiscale  filtering 
by  setting  the  wavelet  coefficients  of  the  signal  to  zero  in 
regions  of  the  wavelet  transform  domain  where  the  signal 
energy  is  known  or  estimated  to  be  much  smaller  than  the 
noise  energy.  In  particular,  we  set  the  fine-scale  wavelet 
coefficients  to  zero  in  the  regions  of  the  image  which  are 
free  of  high-resolution  features.  Since  fine-scale  wavelet 
transform  components  represent  localized  high-resolution 
features  of  the  image,  windowing  these  to  zero  effectively 
smoothes  the  image,  much  as  lowpass  filtering  does.  The 
advantage  of  using  wavelets  is  that  this  can  be  done  on 
a  localized  basis,  smoothing  in  some  areas  while  leaving 
other  areas  (such  as  edges)  alone. 
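
As a minimal illustration of this localized smoothing, the sketch below zeroes the finest-scale Haar coefficients of a 1-D signal only inside a region that is known to be flat. It assumes a single-level orthonormal Haar transform on an even-length signal (the paper itself works with 2-D wavelets), and the function names are hypothetical.

```python
import numpy as np

def haar_analysis(x):
    """One level of the orthonormal Haar transform (even-length input)."""
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return approx, detail

def haar_synthesis(approx, detail):
    """Inverse of haar_analysis."""
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x

def smooth_in_region(x, region):
    """Zero the fine-scale coefficients whose support lies inside `region`.

    region is a boolean mask over samples, True where the signal is known
    to have no high-resolution features.
    """
    approx, detail = haar_analysis(x)
    inside = region[0::2] & region[1::2]      # both samples of the pair in D
    detail = np.where(inside, 0.0, detail)
    return haar_synthesis(approx, detail)
```

Samples outside the region, including any edges there, are returned unchanged, while inside the region each pair of samples is replaced by its average, which reduces the noise energy there.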

The  region  in  the  wavelet  space  where  the  fine-scale 
wavelet  coefficients  are  set  to  zero  can  be  determined  from 
either  a  priori  knowledge  of  image  features  or  statistical 
knowledge  about  the  image  and  noise. 

A  priori  knowledge  of  image  features:  Often  in  image  re¬ 
construction  problems  there  is  considerable  a  priori  knowl¬ 
edge  about  the  image.  For  example,  it  may  be  known,  a 
priori,  that  there  are  no  high-resolution  features  in  some 
region  D  of  the  image.  D  may  be  known  to  represent  a  flat 
or  slowly-varying  part  of  the  image,  or  D  may  be  known  to 
be  free  of  edges.  Since  fine-scale  wavelet  transform  com¬ 
ponents  represent  localized  high-resolution  features  of  the 
image,  we  can  window  these  to  zero  in  the  region  D,  and 
cancel  some  of  the  noise  energy  in  D  without  degrading 
the  image. 

Statistical  knowledge  about  the  image  and  noise:  If  there 
is  no  a  priori  knowledge  of  image  features,  a  threshold¬ 
ing  approach  may  be  used  to  set  some  wavelet  coefficients 
to  zero.  The  idea  is  to  eliminate  noise  where  it  is  pos¬ 
sible  to  do  so  without  significantly  degrading  the  image. 
Since  the  wavelet  transform  is  being  used,  this  can  be  done 
on  a  localized  basis.  In  this  paper,  we  set  the  fine-scale 
wavelet  coefficients  to  zero  in  regions  where  they  fall  be¬ 
low  a  threshold,  a  method  that  was  previously  applied  to 
other  time-frequency  distributions  [5].  In  [1],  we  describe 
a  situation  where  this  thresholding  approach  is  optimal. 
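
The thresholding approach can be sketched as follows for a 2-D image. This is an illustrative sketch only: it assumes a single-level separable orthonormal Haar transform on an image with even dimensions, a hard threshold applied to the three finest-scale detail bands, and hypothetical function names; the threshold value itself has to be supplied by the user.

```python
import numpy as np

S = np.sqrt(2.0)

def haar2d(img):
    """One level of the separable orthonormal 2-D Haar transform."""
    lo = (img[0::2, :] + img[1::2, :]) / S      # row pairs: average
    hi = (img[0::2, :] - img[1::2, :]) / S      # row pairs: difference
    ll = (lo[:, 0::2] + lo[:, 1::2]) / S
    lh = (lo[:, 0::2] - lo[:, 1::2]) / S
    hl = (hi[:, 0::2] + hi[:, 1::2]) / S
    hh = (hi[:, 0::2] - hi[:, 1::2]) / S
    return ll, lh, hl, hh

def ihaar2d(ll, lh, hl, hh):
    """Inverse of haar2d."""
    rows, cols = ll.shape
    lo = np.empty((rows, 2 * cols))
    hi = np.empty((rows, 2 * cols))
    lo[:, 0::2], lo[:, 1::2] = (ll + lh) / S, (ll - lh) / S
    hi[:, 0::2], hi[:, 1::2] = (hl + hh) / S, (hl - hh) / S
    img = np.empty((2 * rows, 2 * cols))
    img[0::2, :], img[1::2, :] = (lo + hi) / S, (lo - hi) / S
    return img

def threshold_fine_scale(img, threshold):
    """Set fine-scale coefficients below the threshold (in magnitude) to zero."""
    ll, lh, hl, hh = haar2d(img)
    keep = lambda d: np.where(np.abs(d) >= threshold, d, 0.0)
    return ihaar2d(ll, keep(lh), keep(hl), keep(hh))
```

Coefficients produced by edges are large and survive the threshold, so edges are left alone, while the small coefficients in flat areas, which are mostly noise, are removed.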

V.  Complete  Procedure  and  Numerical 
Results 

The  results  of  Section  II  are  not  very  useful  in  themselves, 
since  it  is  most  unlikely  that  we  have  a  priori  knowledge 
of  the  actual  values  of  the  image.  However,  we  will  have 
constraints  on  the  values  (all  known  to  be  zero)  of  some 
wavelet  coefficients.  In  this  section,  we  modify  the  results 
of Section II so that constraints on wavelet coefficients are
used to improve the image, and give an example.

We use constraints on two sub-wavelets in this paper and assume that
the discrete wavelet transform (defined by (5) and (8)) is known or
estimated at certain points at each of $L$ scales. We index each of
these points by an integer $c_1(l)$ or $c_2(l)$, where $l$ denotes the
scale and the subscript denotes the sub-wavelet used. It is easily
shown [1] that the LMMSE noise is given by

$$\hat\epsilon(i,j) = \sum_{l=1}^{L}\sum_{c_1(l)=1}^{C_1(l)} [\ldots] + \sum_{l=1}^{L}\sum_{c_2(l)=1}^{C_2(l)} [\ldots] \qquad (9)$$

where [...] is the cross-covariance between [...] and [...], and the
[...] are computed by solving a linear system of equations similar to
(4); for details see [1].

We use the Shepp-Logan phantom as our noiseless image, which is shown
in Fig. 1. The noise added to the projections is obtained by passing
zero-mean white Gaussian noise with variance $4 \times 10^{-[\ldots]}$
through a filter whose [...] transform is $(\sin(w))^{32}$. The image
reconstructed from the noisy projections using FBP is shown in
Fig. 2. The wavelets that we use are 2-D versions of the 1-D Haar
wavelet, which are defined as [...], where $sq(x,y)$ is unity over
the square region $[0,1] \times [0,1]$ and zero elsewhere. We
constrain the two finest-scale wavelet coefficients to be zero in a
region $D$. The region $D$ is obtained by thresholding the
second-finest wavelet coefficients of the [...] center of the image.
[...] The noise power has been decreased by 20.3% in the whole image
and by [...]

Figure  1:  Noiseless  Phantom 


Figure  2:  Phantom  reconstructed  from  noisy  projections 
Figure 3: MMSE image obtained by constraining the wavelet
coefficients of the noisy image.
APPENDIX K1

B.  Sahiner  and  A.E.  Yagle,  “Time-Frequency  Distribution  Inversion  of  the 
Radon  Transform,”  to  appear  in  IEEE  Trans.  Image  Proc.  2(4),  October  1993. 

This  paper  performs  a  time-frequency  analysis  of  the  projection  data  in  the  inverse 
Radon  transform  problem.  Regions  in  time-frequency  space  in  which  the  distribution 
strength  is  below  a  threshold  are  assumed  to  be  due  to  noise,  and  are  set  to  zero.  This 
has  the  effect  of  filtering  noise  out  of  time-frequency  regions  in  which  the  signal  strength 
is  small,  and  leaving  the  noise  in  where  the  signal  strength  is  large.  The  resulting  time- 
frequency  distribution  is  then  projected  to  find  the  nearest  feasible  signal  solution,  which 
is  then  backprojected.  This  reduces  noise  in  the  reconstructed  image  while  maintaining 
sharpness  of  image  features. 
Fig. 9. Top left: input frames with 1-D motion blurs. Top right: input frames
with 2-D blurs. Bottom left: reconstructed image using input frames from the
top left with $\lambda = 1.0$. Bottom right: reconstructed image using input
frames from the top right with $\lambda = 1.0$.


has  been  developed.  Due  to  the  deblurring  process,  the  high-resolution 
reconstruction  is  not  stable.  By  using  a  recursive  scheme  with 
an  adaptively  updated  regularization  parameter,  an  effective  high- 
resolution  reconstruction  has  been  obtained.  A  new  input  image  frame 
can be incorporated into the reconstruction in a very efficient way
without  repeating  the  whole  computation.  The  computation  can  be 
implemented  in  a  highly  parallel  scheme  since  all  the  DFT  compo¬ 
nents  of  the  reconstructed  image  are  computed  independently.  The 
regularization  parameters  should  be  chosen  appropriately,  balancing 
the  deblurring,  high-resolution  restoration  and  noise  amplifications. 
In  order  to  further  improve  the  results,  the  batch  mode  iterative 
computation  (for  fixed  number  of  input  frames)  may  be  incorporated 
between  the  recursive  reconstructions  with  new  input  frames. 
Time-Frequency Distribution
Inversion of the Radon Transform

Berkman Sahiner and Andrew E. Yagle


Abstract: In using filtered backprojection to compute the inverse
Radon transform, the ramp filter amplifies noise. Spatially invariant noise
filters reduce resolution. It is desirable to filter noise where projections
have no local high-frequency components. Using the short-time Fourier
transform, we apply a time-frequency mask filter that zeroes out
projections where local signal energy is below a threshold. Results show
improvement over reconstructions using spatially-invariant smoothing
filters.


I.  INTRODUCTION 

The Radon transform inversion problem is the basic problem of
x-ray tomography. The problem is to reconstruct an image $\mu(x,y)$
from its projections $p(r,\theta)$, where

$$p(r,\theta) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \mu(x,y)\,\delta(r - x\cos\theta - y\sin\theta)\,dx\,dy \qquad (1)$$

is the Radon transform of $\mu(x,y)$.

The most common procedure for image reconstruction from projections
is filtered backprojection (FBP), in which the reconstruction
is carried out in two stages. The first stage is filtering, in which the
projections are filtered to yield the filtered projections
$q(r,\theta)$ [1]. The second stage is backprojection, in which the
image is obtained from filtered projections using

$$\mu(x,y) = \frac{1}{2\pi}\int_{0}^{\pi} q(x\cos\theta + y\sin\theta,\, \theta)\,d\theta. \qquad (2)$$


In the filtering stage, the frequency response of the ideal filter is
$|w|$; this is called a ramp filter [2]. A problem with the ramp
filter is that it amplifies the high-wavenumber or high-frequency
components of both the noise and the data. Since noise usually
dominates at high frequencies, it is common practice to use a low-pass
filter $H(w)$ in conjunction with $|w|$ to improve the signal-to-noise
ratio (SNR). Hence, in practice, we have

$$Q(w,\theta) = P(w,\theta)\,|w|\,H(w) \qquad (3)$$

where $Q(w,\theta)$ and $P(w,\theta)$ denote the Fourier transforms of
$q(r,\theta)$ and $p(r,\theta)$ in the $r$ variable, for each
projection angle $\theta$. Usually, $H(w)$ has some gentle rolloff
characteristics at high frequencies to prevent ringing at edges. A
typical shape for $|w|H(w)$ is shown in Fig. 1. The SNR improvement
obtained by using $H(w)$ comes at the expense of degraded image
resolution, since $H(w)$ will also smooth the edges in the image.
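
The filtering stage can be sketched with an FFT as follows. This is a hedged illustration: the exact shape of $H(w)$ is not given in the text, so the sketch assumes a window that is flat up to a pass bin, rolls off as a raised cosine, and is zero from a stop bin onward. The names `rolloff_window`, `filter_projection`, `pass_bin` and `stop_bin` are hypothetical.

```python
import numpy as np

def rolloff_window(n_bins, pass_bin, stop_bin):
    """Low-pass window H: 1 up to pass_bin, raised-cosine roll-off, 0 from stop_bin."""
    k = np.arange(n_bins)
    H = np.zeros(n_bins)
    H[k <= pass_bin] = 1.0
    roll = (k > pass_bin) & (k < stop_bin)
    H[roll] = 0.5 * (1.0 + np.cos(np.pi * (k[roll] - pass_bin) / (stop_bin - pass_bin)))
    return H

def filter_projection(p, pass_bin, stop_bin):
    """Apply |w| H(w) to one sampled projection p(r) in the DFT domain."""
    n = len(p)
    P = np.fft.rfft(p)
    w = 2.0 * np.pi * np.arange(P.size) / n        # |w| in radians per sample
    H = rolloff_window(P.size, pass_bin, stop_bin)
    return np.fft.irfft(P * w * H, n)
```

A component below the pass bin is simply scaled by its frequency, while a component above the stop bin is removed entirely. This shows why a spatially invariant $H(w)$ cannot keep high frequencies at an edge while rejecting them elsewhere.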

It  is  desirable  to  reduce  the  noise  energy  where  the  projection  en¬ 
ergy  is  small,  while  preserving  the  projection  energy  (and  necessarily 
the  noise  energy)  where  it  is  large.  However,  spatially  invariant  filters 
are  unable  to  have  their  characteristics  vary  in  a  spatially-varying 
manner,  so  this  selectivity  cannot  be  attained. 

In this paper, we use time-frequency (t-f) distributions of $p(r,\theta)$
(considering $r$ as the time variable and $\theta$ as a constant for each
projection angle) to accomplish spatially-varying filtering. In the next
section,  we  summarize  the  basic  approach  used  in  spatially-varying 
filtering  based  on  a  t-f  distribution.  In  Section  III,  we  concentrate 
on  the  short-time  Fourier  transform,  which  is  chosen  as  the  t-f 
distribution  tool  in  this  correspondence.  We  present  the  application 
to  the  inverse  Radon  transform  and  numerical  results  in  Section  IV. 

II. SPATIALLY-VARYING FILTERING BASED ON T-F DISTRIBUTIONS

T-f  distributions  (representations)  describe  the  intensity  of  a  signal 
simultaneously  in  time  and  frequency.  Such  representations  are  of 
interest when dealing with nonstationary signals.

Numerous  definitions  of  the  t-f  distribution  F{w,t)  of  a  signal 
f(t)  have  been  suggested  [3].  The  most  familiar  representation  is  the 
short-time  Fourier  transform  (STFT),  which  is  reviewed  in  Section 
III. STFT is a member of a larger family called Cohen's class of
distributions  [3];  each  member  of  the  class  has  its  advantages  and 
drawbacks,  and  there  is  no  single  “correct”  choice  of  a  representation. 

A signal $f(t)$ can be filtered in the time-frequency domain as
follows. The first step is to multiply $F(w,t)$ by a filter $W(w,t)$
to yield [4]

$$G(w,t) = F(w,t)\,W(w,t). \qquad (4)$$

The function $W(w,t)$ may be called a time-varying transfer function,
since it represents the factor by which different frequency components
of the local spectrum around the time $t$ are multiplied. The
multiplication is analogous to the classical time-invariant filtering
in which $F(w)$ is multiplied by a filter transfer function $W(w)$.

If one is interested in recovering a signal from noise-corrupted
data, then a typical choice for $W(w,t)$ is a one-zero mask [5]-[9].
$W(w,t)$ is set to unity in the region of the t-f plane where the energy
of the signal is above a threshold [5], [8]. This region is called the
region of support (ROS) of $f(t)$ in the t-f plane. If the signal energy
is below the threshold, then $W(w,t)$ is set to zero.

After $G(w,t)$ is computed, the second step is computation of
the function $g(t)$ whose t-f distribution is $G(w,t)$. $g(t)$ is then
the time-frequency variant filtered $f(t)$. However, $G(w,t)$ may not
be a valid t-f distribution, i.e., there may be no function $g(t)$
whose t-f distribution is $G(w,t)$. In this case, one possible
solution is to find the signal $g(t)$ whose t-f distribution best
approximates $G(w,t)$ in the mean squared error (MSE) sense. Note that
this error measure does not necessarily lead to the "best" filtered
signal, but is chosen because it is easy to deal with. Letting
$\hat G(w,t)$ represent the t-f distribution of $g(t)$, one finds
$g(t)$ such that

$$D = \int\!\!\int |G(w,t) - \hat G(w,t)|^2\,dt\,dw \qquad (5)$$

is minimized. We show how to find this $g(t)$ for the STFT in the
next section.


III. THE SHORT-TIME FOURIER TRANSFORM

The STFT of a signal $f(t)$ is defined as [4]

$$F(w,t) = \int_{-\infty}^{\infty} f(\tau)\,h(t - \tau)\,e^{-jw\tau}\,d\tau \qquad (6)$$

where $h(t)$ is called the analysis filter. The shape and length of
$h(t)$ are important: too short a filter may result in poor frequency
resolution, and too long a filter may result in poor temporal resolution.
For a discrete-time signal $f(k)$, the discrete short-time Fourier
transform (DSTFT) is defined as [10]:

$$F(m,k) = \sum_{l=-\infty}^{\infty} f(l)\,h(kR - l)\,e^{-j\Omega m l} \qquad (7)$$

where $M$ is the transform size, $\Omega = 2\pi/M$ is the sampling
period in the frequency domain, $R$ is the sampling period in the time
domain, $m = 0, \ldots, M-1$ is the frequency variable, and $k$ is the
time variable.

With mild conditions on the analysis filter (e.g., if $R < M$, the
first $R$ samples of $h(n)$ should be nonzero, or if $R$ divides $M$,
none of the $R$ polyphases of $h(n)$ should be identically zero),
$f(k)$ can be recovered exactly from $F(m,k)$ [11]. The inversion
relation is

$$f(l) = \sum_{k=-\infty}^{\infty} s(l - kR)\,\frac{1}{M}\sum_{m=0}^{M-1} F(m,k)\,e^{j\Omega m l} \qquad (8)$$

where $s(l)$ is called the synthesis filter and is determined by the
analysis filter $h(l)$. An algebraic approach to determine $s(l)$ for
exact reconstruction is given in [12].

When we wish to perform time-frequency filtering to obtain a
filtered signal $g(k)$ from $f(k)$, we first multiply $F(m,k)$ by a
mask $W(m,k)$ (compare to (4))

$$G(m,k) = F(m,k)\,W(m,k) \qquad (9)$$

and then we find the function $g(k)$ whose DSTFT is closest to
$G(m,k)$. The solution to the problem of finding $g(k)$ from $G(m,k)$
has been addressed in [13]. In particular, when the analysis filter
length is not larger than the transform length $M$, then the synthesis
equation (8) applied to $G(m,k)$ gives the $g(k)$ minimizing the
discrete counterpart to (5).
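
The analysis, masking and synthesis steps above can be sketched in NumPy. This is a minimal illustration under stated assumptions rather than the method of [13]: the synthesis is the weighted overlap-add least-squares estimate (each sample is the window-weighted average of the inverse-transformed frames), the window length does not exceed the DFT length, and phases are referenced to the start of each frame instead of to absolute time. All function and parameter names are hypothetical.

```python
import numpy as np

def dstft(f, h, hop, n_fft):
    """Windowed frames of f (spaced `hop` samples apart), then an n_fft-point DFT.

    Phases are referenced to the start of each frame rather than to absolute
    time, which changes each coefficient only by a unit-modulus factor.
    """
    width = len(h)
    x = np.concatenate([np.zeros(width), f, np.zeros(width)])
    starts = np.arange(0, len(x) - width + 1, hop)
    frames = np.array([x[s:s + width] * h for s in starts])
    return np.fft.fft(frames, n=n_fft, axis=1)

def least_squares_synthesis(G, h, hop, n_out):
    """Signal whose DSTFT is closest (in squared error) to G; needs len(h) <= n_fft."""
    width = len(h)
    num = np.zeros(n_out + 2 * width)
    den = np.zeros(n_out + 2 * width)
    for k, spectrum in enumerate(G):
        frame = np.real(np.fft.ifft(spectrum))[:width]
        start = k * hop
        num[start:start + width] += h * frame
        den[start:start + width] += h ** 2
    den[den == 0.0] = 1.0                      # samples no frame covers stay zero
    return (num / den)[width:width + n_out]

def tf_mask_filter(f, h, hop, n_fft, threshold):
    """Multiply the DSTFT by a one-zero mask and resynthesize."""
    F = dstft(f, h, hop, n_fft)
    W = (np.abs(F) > threshold).astype(float)  # 1 inside the region of support
    return least_squares_synthesis(F * W, h, hop, len(f))
```

With an all-ones mask the synthesis returns the input exactly. With a magnitude threshold, noise is removed wherever the local spectrum is weak and retained only in the time-frequency cells occupied by the signal.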


Fig.  2.  Reconstruction  from  noiseless  data. 


Fig. 3. The best reconstruction from noisy data using filtered backprojection
and a spatially invariant filter.


IV. APPLICATION TO FILTERED BACKPROJECTION

Now suppose the projection data are noisy. Then (3) becomes

$$Q(w,\theta) = P_n(w,\theta)\,|w|\,H(w) \qquad (10)$$

where the subscript $n$ in (10) indicates that the projections are
corrupted by noise.

Our idea is to spatially vary the bandwidth of $H(w)$. Let us define

$$R(m,k,\theta) = W(m,k,\theta)\,P_n(m,k,\theta) \qquad (11)$$

for each angle $\theta$, where

$$W(m,k,\theta) = \begin{cases} 1 & \text{in the region of support (ROS)} \\ 0 & \text{otherwise} \end{cases}$$

and $P_n(m,k,\theta)$ denotes the DSTFT of $p_n(r,\theta)$ for a fixed
$\theta$. Here $p_n(r,\theta)$ has been discretized in $r$. Notice
that the radial variable $r$ plays the role of the time variable in
the DSTFT, i.e., we are applying a time-frequency representation to
process the data in a spatially-varying fashion.

The ROS can be determined from a priori knowledge about it and the
DSTFT of $p(r,\theta)$ with respect to the variable $r$. One
possibility is to estimate the ROS using the threshold test given in
Section II, i.e., define the ROS to be the region in which the DSTFT
amplitude exceeds a threshold. We then obtain $\hat p(r,\theta)$ as
the signal such that the MSE between the DSTFT of $\hat p(r,\theta)$
and $R(m,k,\theta)$ is minimized. For a suitable choice of the
analysis filter $h$, this can be done using the synthesis equation
given in Section III.

Using this $\hat p$, we proceed with (3) to find the filtered projections.


Fig. 4. The best reconstruction from noisy data using a t-f mask.


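Example 1: Consider the following function, defined over the
square $-1 \le x, y \le 1$:

$$\mu(x,y) = \begin{cases} 1 & \text{if } x^2 + y^2 < R^2 \\ 0 & \text{otherwise} \end{cases}
\;\Longrightarrow\;
p(r,\theta) = p(r) = \begin{cases} 2\sqrt{R^2 - r^2} & |r| < R \\ 0 & \text{otherwise.} \end{cases}$$

In the simulations below, we have used samples of $p(r,\theta)$ with 128
samples in the $r$ variable (over $-1 \le r \le 1$) and 50 samples in the
$\theta$ variable (over $0 \le \theta < \pi$).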
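
The sampled projection data for this example can be generated as in the sketch below. It assumes the object is the unit-density disk of radius $R$ centered at the origin, so that every view equals $2\sqrt{R^2 - r^2}$. The radius is not stated in the text, so the default `radius=0.5` is a hypothetical value. The noise variance 0.0036 is the one quoted for the noisy experiment, and the function names are hypothetical.

```python
import numpy as np

def disk_projections(radius=0.5, n_r=128, n_theta=50):
    """Projections of a unit-density disk: p(r, theta) = 2*sqrt(R^2 - r^2) for |r| < R."""
    r = np.linspace(-1.0, 1.0, n_r)
    theta = np.arange(n_theta) * np.pi / n_theta          # 0 <= theta < pi
    p_r = 2.0 * np.sqrt(np.clip(radius ** 2 - r ** 2, 0.0, None))
    return r, theta, np.tile(p_r[:, None], (1, n_theta))  # identical in every view

def add_projection_noise(p, variance=0.0036, seed=0):
    """Add zero-mean white Gaussian noise with the given variance."""
    rng = np.random.default_rng(seed)
    return p + np.sqrt(variance) * rng.normal(size=p.shape)
```

Because the disk is circularly symmetric, the 50 views are identical before noise is added, which makes the effect of the noise and of each filter easy to compare along a single cross-section.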

Reconstruction from noiseless $p(r,\theta)$ using FBP is given in Fig.
2. To carry out the backprojection, linear interpolation is used in the
radial variable, and a "small artifact" interpolation is employed in the
angular variable since the number of angular samples is less than the
number of radial samples [2].

Fig. 5. Comparison of Figs. 2, 3, and 4 along the cross-section $x = 25$.
Lines correspond to Fig. 2, short dashed lines correspond to Fig. 3, and
long dashed lines correspond to Fig. 4.

Next, we reconstruct $\mu(x,y)$ from noisy projection data
$p_n(r,\theta)$ obtained by adding white Gaussian noise with zero mean
and variance 0.0036 to the projections. The "best" (in the sense of
subjective human observation) reconstruction using FBP and a
time-invariant filter $H(w)$ is given in Fig. 3. The optimal $H(w)$
was found by changing parameters $k_1$ and $k_2$ in Fig. 1 and
choosing $k_1 = 21$ and $k_2 = 35$, which gave the best reconstruction.

We now present the result using t-f filtering. We choose our analysis
filter to be Gaussian, $h(n) = [\ldots]$, with length 32 (i.e., we
assume $h(k) = 0$ for $k > 16$ and $k < -16$). We also define the t-f
mask $W(m,k)$ as

$$W(m,k) = \begin{cases} 1 & \text{if } |P_n(m,k)| > t_v \\ 0 & \text{otherwise} \end{cases} \qquad (14)$$

where $t_v$ is a fixed threshold, found by choosing the value that gives
the best reconstruction. The resulting reconstructed image is given
in Fig. 4.

To  compare  the  reconstructions  of  Fig.  3  and  Fig.  4  with  the 
noiseless  reconstruction  Fig.  2,  we  also  plot  a  cross-section  through 
each  image  along  x  =  25  in  Fig.  5.  The  noiseless  reconstruction  is 
close  to  unity  in  the  center  of  the  image,  as  expected.  Reconstruction 
with  a  t-f  mask  follows  the  noiseless  reconstruction  closely  at  the 
edges,  due  to  the  fact  that  the  projection  energy  in  the  t-f  plane 
is  preserved  where  it  is  expected  to  be  large.  However,  this  does 
not  result  in  increased  noise  energy  (compared  to  space-invariant 
filtering)  at  other  points  in  the  image. 

Example 2: We now apply the algorithms given in Example 1
to a frequently used phantom in medical imaging. The phantom is
supposed to be a section of the human head with the denser (high
$\mu$) areas indicating tumors and the less dense areas indicating
spinal fluid. The reconstruction from noiseless data is given in
Fig. 6. The effect of the skull has been removed using bone deleting
techniques [14].


Fig.  6.  Reconstruction  from  noiseless  data. 

The additive noise is again Gaussian with variance $10^{-[\ldots]}$. The
best reconstruction from noisy data using FBP and a time-invariant filter
is given in Fig. 7 ($k_1 = 34$ and $k_2 = 37$).

Reconstruction using a t-f mask is given in Fig. 8. Note that
compared to Fig. 7, all of the larger, elliptical objects in Fig. 8 contain
less noise; however, the same degree of resolution is maintained at
the edges. Also, the noise in the background (which is ideally flat) is
reduced, and therefore the small peaks in the front are more easily
distinguished from the background.

V.  CONCLUSION 

We have applied the idea of time-frequency masking to the
inversion of the Radon transform. This results in a spatially-varying
filter which regularizes the $|w|$ filter, reduces the noise, and still
preserves local high-frequency features such as edges. Two examples
illustrate the improvement over spatially-invariant filtering.
TIME-FREQUENCY DISTRIBUTION INVERSION
OF THE RADON TRANSFORM


Berkman Sahiner and Andrew E. Yagle
Dept. of Electrical Engineering and Computer Science
The University of Michigan, Ann Arbor, Michigan 48109


Abstract


In using filtered backprojection to compute the inverse
Radon transform, the Hilbert transform-derivative filtering
operation amplifies noise. Filtering the noise generally
reduces resolution and smoothes edges. It would be desirable
to spatially filter the noise where the projections do
not have high-frequency components, and accept the noise
where high-frequency components are present, since these
must be preserved for the Hilbert transform-derivative.
Using the short-time Fourier transform, we apply a
time-frequency mask filter that zeroes out each projection in
locations where local signal energy is below a threshold.
Results show improvement over reconstructions using
time-invariant smoothing filters.



1 Introduction

The Radon transform inversion problem is the basic problem
of x-ray tomography. The problem is to reconstruct
an image $\mu(x,y)$ from its projections $p(r,\theta)$, where

$$p(r,\theta) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \mu(x,y)\,\delta(r - x\cos\theta - y\sin\theta)\,dx\,dy \qquad (1)$$

is the Radon transform of $\mu(x,y)$.

The most common procedure for inverting the Radon
transform is filtered backprojection (FBP), in which the
projections $p(r,\theta)$ are first filtered to yield

$$Q(w,\theta) = P(w,\theta)\,|w| \qquad (2)$$

where $P(w,\theta)$ is the Fourier transform of $p(r,\theta)$ and $|w|$
performs the Hilbert transform and derivative operations.
The image $\mu(x,y)$ is obtained by backprojecting $q(r,\theta)$,
the inverse Fourier transform of $Q(w,\theta)$, using

$$\mu(x,y) = \frac{1}{2\pi}\int_{0}^{\pi} q(x\cos\theta + y\sin\theta,\, \theta)\,d\theta \qquad (3)$$

A problem with FBP is that the filter $|w|$ amplifies the
high-wavenumber or high-frequency components of both
noise and the data. Since noise usually dominates at
high frequencies, it is common practice to use a low-pass
filter $H(w)$ in conjunction with $|w|$ to improve the signal-to-noise
ratio (SNR). Hence, in practice, we have

$Q(w,\theta) = P(w,\theta)\,|w|\,H(w)$  (4)

where $H(w)$ has some gentle rolloff characteristics at high
frequencies to prevent ringing at edges. A typical shape
for $|w|H(w)$ is shown in Figure 1. However, the SNR improvement
obtained by using $H(w)$ comes at the expense
of degraded image resolution, since $H(w)$ will also smooth
the edges in the image.

Figure 1: A typical shape for $|w|H(w)$
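The filtering step (4) can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: the raised-cosine rolloff between two frequency indices `k1` and `k2` is an assumed shape for $H(w)$ (the text only says "gentle rolloff"), and the function names are hypothetical.

```python
import numpy as np

def rolloff_window(n, k1, k2):
    """Assumed H(w) on the n-point FFT grid: 1 up to index k1,
    raised-cosine rolloff between k1 and k2, 0 from k2 upward."""
    k = np.abs(np.rint(np.fft.fftfreq(n) * n))   # |frequency index| of each FFT bin
    h = np.zeros(n)
    h[k <= k1] = 1.0
    band = (k > k1) & (k < k2)
    h[band] = 0.5 * (1.0 + np.cos(np.pi * (k[band] - k1) / (k2 - k1)))
    return h

def filter_projection(p, k1, k2):
    """Return q = IFFT{ P(w) |w| H(w) } for one sampled projection p, as in (4)."""
    n = len(p)
    w = 2.0 * np.pi * np.abs(np.fft.fftfreq(n))  # |w| in radians per sample
    q = np.fft.ifft(np.fft.fft(p) * w * rolloff_window(n, k1, k2))
    return q.real                                # p is real, so q is real up to rounding
```

Widening the passband (larger `k1`, `k2`) keeps sharper edges but passes more noise, which is the trade-off the paragraph above describes.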

It  is  desirable  to  reduce  the  noise  energy  where  the  projec¬ 
tion  energy  is  small,  while  preserving  the  projection  energy 
(and  necessarily  the  noise  energy)  where  it  is  large.  But 
spatially  invariant  filters  are  unable  to  have  their  charac¬ 
teristics  vary  in  a  nonstationary  manner,  so  this  selectivity 
cannot  be  attained. 

In this paper, we use time-frequency (t-f) distributions
of $p(r,\theta)$ (considering $r$ as the time variable and $\theta$ as a
constant for each projection angle) to accomplish spatially
variant filtering. In the next section, we summarize the
basic approach used in spatially varying filtering based on
a t-f distribution. In Section III, we concentrate on the
short-time Fourier transform, which is chosen as the t-f
distribution tool in this paper. We present the application
to the inverse Radon transform and numerical results in
Section IV.


2  Spatially-Varying  Filtering 
Based  on  T-F  Distributions 

T-f  distributions  (representations)  describe  the  intensity 
of  a  signal  simultaneously  in  time  and  frequency.  Such 
representations  are  of  interest  when  one  is  dealing  with 
nonstationary  signals. 

Numerous definitions of the t-f distribution $F(w,t)$ of a
signal $f(t)$ have been suggested [1, 2, 3]. The most familiar
representation is the short-time Fourier transform (STFT),
which is reviewed in Section III. The STFT is a member of a
larger family, called Cohen's class of distributions [4]. Each
member of the class has its advantages and drawbacks, and
there is no single "correct" choice of a representation.

A signal $f(t)$ can be filtered in the time-frequency domain
as follows. First, multiply $F(w,t)$ by a filter $W(w,t)$ [5],
yielding

$G(w,t) = F(w,t)\,W(w,t)$  (5)

The function $W(w,t)$ may be called a time-varying transfer
function, since it represents the factor by which different
frequency components of the local spectrum around the
time $t$ are multiplied. The multiplication is analogous to
the classical time-invariant filtering in which $F(w)$ is multiplied
by a filter transfer function $W(w)$.

If one is interested in recovering a signal from noise-corrupted
data, then a typical choice for $W(w,t)$ is a one-zero
mask [6]. $W(w,t)$ is set to 1 in the region of the t-f
plane where the energy of the signal is above a threshold.
This region is called the Region of Support (ROS) of $f(t)$
in the t-f plane. If the signal energy is below the threshold,
then $W(w,t)$ is set to 0. The threshold value is usually set
arbitrarily, and little is understood about the effects of the
size of the one-zero mask on the filtered signal [6].
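As a small illustration of the one-zero mask, the sketch below thresholds the squared magnitude of a sampled t-f distribution. The array layout and the function name `one_zero_mask` are assumptions made for the example; the threshold is a free parameter, as the text notes.

```python
import numpy as np

def one_zero_mask(F, threshold):
    """W = 1 where the local energy |F|^2 exceeds the threshold, 0 elsewhere."""
    return (np.abs(F) ** 2 > threshold).astype(float)

# Example: a tiny 2 x 2 "t-f plane"; the filtered distribution is G = F * W as in (5).
F = np.array([[0.1, 2.0], [3.0j, 0.5]])
W = one_zero_mask(F, 1.0)
G = F * W
```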

Once $G(w,t)$ is computed, one can proceed to find the
function $g(t)$ whose t-f distribution is $G(w,t)$. $g(t)$ is then
the spatially-variant filtered $f(t)$. However, $G(w,t)$ may
not be a valid t-f distribution. That is, there may be no
function $g(t)$ whose t-f distribution is $G(w,t)$. In this case,
one tries to find the signal $g(t)$ whose t-f distribution best
approximates $G(w,t)$ in the mean squared error sense. Letting
$\hat{G}(w,t)$ represent the t-f distribution of $g(t)$, one finds
$g(t)$ such that

$D = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} |\hat{G}(w,t) - G(w,t)|^2\,dt\,dw$  (6)

is minimized. We show how to find this $g(t)$ for the STFT
next.


3  The  Short-Time  Fourier  Transform


The STFT of a signal $f(t)$ is defined as [5]

$F(w,t) = \int_{-\infty}^{\infty} f(\tau)\,h(t-\tau)\,e^{-jw\tau}\,d\tau$  (7)

where $h(t)$ is called the analysis filter or the window function.
The shape of $h(t)$ and its length $L$ are important.
Too short a window may result in poor frequency resolution
and too long a window may result in poor temporal
resolution.

For a discrete-time signal $f(l)$, the discrete short-time
Fourier transform (DSTFT) is defined as [7]:

$F(m,k) = \sum_{l=-\infty}^{\infty} f(l)\,h(Rk-l)\,e^{-j\Omega_M m l}$  (8)

where

$M$ is the transform size
$\Omega_M = 2\pi/M$ is the sampling period in the frequency domain
$R$ is the sampling period in the time domain
$m = 0, 1, \ldots, M-1$ is the frequency variable
$k$ is the time variable.

With a mild condition on the analysis filter, $f(l)$ can
be recovered exactly from $F(m,k)$ for $R < M$ [8]. The
inversion relation is:

$f(l) = \sum_{k=-\infty}^{\infty} \frac{1}{M}\sum_{m=0}^{M-1} F(m,k)\,s(l-Rk)\,e^{j\Omega_M m l}$  (9)

where $s(l)$ is called the synthesis filter and is dictated by
the analysis filter $h(l)$. An algebraic approach to determine
$s(l)$ for exact reconstruction is given in [9].

When we wish to do time-frequency filtering to obtain a
filtered signal $g(k)$ from $f(k)$, we first multiply $F(m,k)$ by
a mask $W(m,k)$ (compare to (5))

$G(m,k) = F(m,k)\,W(m,k)$  (10)

and then we find the function $g(k)$ whose DSTFT is closest
to $G(m,k)$. The solution to the problem of finding
$g(k)$ from $G(m,k)$ has been addressed in [10]. In particular,
when the analysis filter length $N$ is not larger than
the transform length $M$, then the synthesis equation (9)
applied to $G(m,k)$ gives the $g(k)$ minimizing the discrete
counterpart to (6).


4  Numerical  Results

Consider (4) again:

$Q_n(w,\theta) = P_n(w,\theta)\,|w|\,H(w)$  (11)

where the subscript $n$ in (11) indicates that the projections
are corrupted by noise.

Our idea is to spatially vary the bandwidth of $H(w)$.
Let us define


$R(m,k,\theta) = W(m,k,\theta)\,P_n(m,k,\theta)$  (12)

for each angle $\theta$, where

$W(m,k,\theta) = \begin{cases} 1 & \text{in the region of support (ROS)} \\ 0 & \text{otherwise} \end{cases}$  (13)

and $P_n(m,k,\theta)$ denotes the DSTFT of $p_n(r,\theta)$ for a fixed
$\theta$. Here, $p_n(r,\theta)$ has been discretized in $r$.

The region of interest is determined from a priori knowledge
about the ROS and the DSTFT of $p(r,\theta)$ with respect
to the variable $r$. One possibility for determining the region
of interest is to estimate the ROS using the threshold test
given in Section II. That is, define the ROS to be the region
in which the signal amplitude exceeds some threshold.

We then obtain $\hat{p}(r,\theta)$ as the signal such that the MSE
between the DSTFT of $\hat{p}(r,\theta)$ and $R(m,k,\theta)$ is minimized.
For a suitable choice of the analysis filter $h$, this can be
done using the synthesis equation given in Section III.
Using this $\hat{p}$, we proceed with (4) to find the filtered
projections.
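The masking-and-resynthesis procedure for one projection can be sketched as below. This is an illustrative stand-in rather than the paper's method: frames are windowed with frame-relative phase, and synthesis uses the least-squares weighted overlap-add form instead of an explicit synthesis filter $s(l)$. The function names and the requirement that the hop `R` not exceed the window length are assumptions of the sketch.

```python
import numpy as np

def dstft(f, h, R, M):
    """Windowed frames of f (window h of length N <= M, hop R), M-point FFT each."""
    N = len(h)
    pad = np.concatenate([np.zeros(N), f, np.zeros(N)])   # cover both signal edges
    starts = range(0, len(pad) - N + 1, R)
    return np.array([np.fft.fft(pad[s:s + N] * h, M) for s in starts])

def idstft_ls(G, h, R, n_out):
    """Least-squares overlap-add synthesis from a (possibly masked) DSTFT G."""
    N = len(h)
    total = N + (G.shape[0] - 1) * R
    num = np.zeros(total)
    den = np.zeros(total)
    for k, frame in enumerate(G):
        y = np.fft.ifft(frame).real[:N]       # windowed frame implied by row k
        num[k * R:k * R + N] += h * y
        den[k * R:k * R + N] += h ** 2
    g = num / np.maximum(den, 1e-12)
    return g[N:N + n_out]                     # drop the zero padding added in dstft

def tf_mask_filter(p, h, R, M, threshold):
    """Zero the t-f cells of p whose energy is at or below the threshold, then resynthesize."""
    F = dstft(p, h, R, M)
    W = (np.abs(F) ** 2 > threshold).astype(float)
    return idstft_ls(F * W, h, R, len(p))
```

With a threshold below zero the mask is all ones and the input is returned unchanged, which is a convenient sanity check before tuning the threshold.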

Example 1: Consider the following function, defined over
$-1 < x < 1$, $-1 < y < 1$:

$\mu(x,y) = \begin{cases} 1 & x^2 + y^2 < R^2 \\ 0 & \text{otherwise} \end{cases}$  (14)

$\Rightarrow\ p(r,\theta) = p(r) = \begin{cases} 2\sqrt{R^2 - r^2} & |r| < R \\ 0 & \text{otherwise} \end{cases}$  (15)

In the simulations below, we have used samples of $p(r,\theta)$
with 128 samples in the $r$ variable (over $-1 < r < 1$) and
50 samples in the $\theta$ variable (over $0 < \theta < \pi$).
Reconstruction from noiseless $p(r,\theta)$ using FBP is given
in Figure 2. To carry out the backprojection, linear interpolation
is used in the $r$ variable and a "small artifact"
interpolation is employed in the angular variable [11].
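The sampled projection data of Example 1 can be generated with a few lines. The sketch below assumes a placeholder disk radius of 0.5 (the value used in the paper is not legible in the source) and an endpoint-inclusive grid in $r$; both are assumptions for illustration only.

```python
import numpy as np

def disk_projection(r, radius):
    """Radon transform (15) of the indicator of a centered disk: 2*sqrt(radius^2 - r^2)."""
    r = np.asarray(r, dtype=float)
    return 2.0 * np.sqrt(np.clip(radius ** 2 - r ** 2, 0.0, None))

r = np.linspace(-1.0, 1.0, 128)                  # 128 samples in r
theta = np.arange(50) * np.pi / 50               # 50 projection angles in [0, pi)
p = np.tile(disk_projection(r, 0.5), (50, 1))    # the disk looks the same from every angle
```

Because the object is circularly symmetric, every row of `p` is identical; integrating a row over `r` should approximate the disk area.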


Figure 2: Reconstruction of $\mu(x,y)$ for Example 1, with
noiseless data

Next, we reconstruct $\mu(x,y)$ from noisy projection data
$p_n(r,\theta)$ obtained by adding white Gaussian noise with zero
mean and variance $3.6 \times 10^{[\text{illegible}]}$ to the projections. The
"best" (in the sense of subjective human observation) reconstruction
using FBP and a time-invariant filter $H(w)$ is
given in Figure 3. The optimal $H(w)$ was found by changing
parameters $k_1$ and $k_2$ in Figure 1 and choosing the values
that give the best reconstruction, which are $k_1 = 21$
and $k_2 = 35$.


Figure  3:  The  best  reconstruction  from  noisy  data  with  a 
shift  invariant  filter 


We now present the result using t-f filtering. In order
to use the time-frequency approach, we first choose our
analysis filter to be Gaussian,

$h(n) =$ [expression illegible]

with length 32 (i.e., we assume $h(k) = 0$ for $|k| > 16$). We
also define the t-f mask $W(m,k)$ as

$W(m,k) = \begin{cases} 1 & \text{if } |P(m,k)|^2 > \nu \\ 0 & \text{otherwise} \end{cases}$  (16)

where $\nu$ is a fixed threshold, found by choosing the value
that gives the best reconstruction.

The  resulting  reconstructed  image  is  given  in  Figure  4. 

To compare the reconstructions of Figure 3 and Figure
4 with the noiseless reconstruction, we also plot a cross-section
through each image along x = 25 in Figure 5. The
noiseless reconstruction is close to unity in the center of
the image, as expected. Reconstruction with a t-f mask
follows the noiseless reconstruction closely at the edges,
due to the fact that the projection energy in the t-f plane
is preserved where it is expected to be large. However,
this does not result in increased noise energy (compared
to space-invariant filtering) at other points in the image.

Figure 4: The best reconstruction from noisy data with a
t-f mask

Figure 5: Comparison of Figures 2, 3, and 4 along the cross
section x=25. Solid lines correspond to Figure 2, short dashed
lines to Figure 3, and long dashed lines to Figure 4.

Example 2: We now apply the algorithms given in Example
1 to a frequently-used phantom in medical imaging.
The phantom is supposed to be a section of the human
head with the denser (high $\mu$) areas indicating tumors and
the less dense areas indicating spinal fluid. The reconstruction
from noiseless data is given in Figure 6. The effect of
the skull has been removed using bone deleting techniques
[12].

The best reconstruction from noisy data (noise
variance $= 10^{-[\text{illegible}]}$) using FBP and a time-invariant filter is
given in Figure 7, found by choosing $k_1 = 34$ and $k_2 = 37$.
Reconstruction using a t-f mask is given in Figure 8.

Note that in Figure 8, the oval, high-density area in the
center contains less noise and hence is easier to notice.
Also, the noise in the background (which is ideally flat) is
reduced, and therefore the small peaks in front of the oval
area are more easily distinguished from the background.


5  Conclusions 

We have applied the idea of time-frequency masking to
the inversion of the Radon transform. This results in a
spatially-varying filter which regularizes the $|w|$ filter, reduces
the noise, and still preserves local high-frequency
features such as edges. Two examples illustrate the improvement
over spatially-invariant filtering.


[1] A. W. Rihaczek, "Signal energy distribution in time
and frequency," IEEE Trans. Inform. Theory, vol. IT-14,
p. 369, 1968.

[2] T. A. C. M. Claasen and W. F. G. Mecklenbrauker,
"The Wigner distribution - A tool for time-frequency
signal analysis, Part I: Continuous-time signals,"
Philips J. Res., vol. 35, pp. 217-389, 1980.

[3] H. I. Choi and W. J. Williams, "Improved time-frequency
representation of multicomponent signals
using exponential kernels," IEEE Trans. Acoust.,
Speech, Sig. Proc., vol. 37, pp. 862-871, 1989.

[4] L. Cohen, "Time-frequency distributions - a review,"
Proc. IEEE, vol. 77, pp. 941-981, July 1989.

[5] B. E. A. Saleh and N. S. Subotic, "Time-variant filtering
of signals in the mixed time-frequency domain,"
IEEE Trans. Acoust., Speech, Sig. Proc., vol. 33,
pp. 1479-1487, 1985.

[6] T. E. Koczwara and D. L. Jones, "On mask selection
for time-varying filtering using the Wigner Distribution,"
in Proc. ICASSP 1990, pp. 2487-2490.

[7] M. R. Portnoff, "Representation of digital signals and
systems based on the Short Time Fourier Transform,"
IEEE Trans. Acoust., Speech, Sig. Proc., vol. 28,
pp. 55-69, 1980.

[8] Z. Shpiro and D. Malah, "An algebraic approach
to discrete short-time Fourier transform analysis and
synthesis," IEEE ICASSP-84, pp. 2.3.1-2.3.4, pp. 804-807.

[9] Z. Shpiro and D. Malah, "Design of filters for Discrete
Short Time Fourier Transform synthesis," IEEE
ICASSP-85, pp. 14.6.1-14.6.4, pp. 537-540.

[10] A. Dembo and D. Malah, "Signal synthesis from modified
Discrete Short Time Fourier Transform," IEEE
Trans. Acoust., Speech, Sig. Proc., vol. 34, pp. 168-180,
Feb. 1988.

[11] R. A. Brooks, G. H. Weiss and A. J. Talbert, "A new
approach to interpolation in computed tomography,"
J. Comput. Assist. Tomogr., vol. 2, pp. 577-585, 19[illegible].

[12] G. Henrich, N. Mai and M. Backmund, "Preprocessing
in CT picture analysis: A bone deleting algorithm,"
J. Comput. Assist. Tomogr., vol. 3, pp. 3[illegible]-384, 1979.


Figure 6: Reconstruction with noiseless data for Example 2


Figure 7: Reconstruction from noisy data with a shift-invariant
filter


Figure 8: Reconstruction from noisy data with a t-f mask

APPENDIX  LI

B.  Sahiner  and  A.E.  Yagle,  “A  Fast  Algorithm  for  Backprojection  with 
Linear  Interpolation,”  to  appear  in  IEEE  Trans.  Image  Proc.  2(4),  October 
1993. 

This paper derives a simple fast algorithm for backprojection in the inverse Radon
transform. Interpolating and backprojecting four views at once saves half the multiplications.

Fig. 3(a) shows reconstructed Shepp and Logan's head phantom
image using the circular fan-beam formula with a scanning circle of
diameter G. Fig. 3(b) shows the reconstructed image corresponding
to Fig. 3(a) using our noncircular fan-beam formula with a scanning
square of side-length G. Fig. 3(c) shows the plots of the y = 0.19[illegible] line
for Fig. 3(a) and 3(b). It can be observed that the reconstructed results
obtained using different fan-beam formulas are almost the same in
the reconstruction region. More simulation results were presented in
[10]. The parameters of the above simulation are typical in our x-ray
microtomographic system [11]. It can be verified that the square
scanning locus used indeed meets all three conditions.


V.  CONCLUSION 

As far as the mathematical form is concerned, the proposed formula
is the simplest among noncircular fan-beam formulas, as it is the same
as the circular fan-beam formula [9] except that the source-to-origin
distance depends on the rotation angle. However, we would like to
emphasize  that  our  formula  is  exact  only  with  a  symmetric  scanning 
locus.  If  this  symmetry  condition  is  violated,  other  noncircular  fan- 
beam  formulas  should  be  used  for  exact  reconstruction,  provided  that 
the  relevant  conditions  required  are  satisfied. 

The  main  difference  between  Weinstein’s  formula  [3]  or  Gullberg’s 
formula  [5]  and  the  proposed  formula  lies  in  that  the  new  formula 
requires  no  derivative  of  the  scanning  locus  with  respect  to  the 
rotation  angle.  Smith’s  extended  fan-beam  formula  also  contains  a 
derivative  of  the  scanning  locus  [4].  For  an  irregular  scanning  locus, 
it  may  not  be  trivial  to  estimate  accurately  its  derivative.  For  example, 
in  our  x-ray  microtomographic  study,  the  scanning  locus  is  subject 
to  random  interferences  introduced  by  the  mechanical  rotation  of  the 
specimen  stage  [11]-[13].  As  a  result,  a  precise  estimation  of  the 
derivative  of  the  scanning  locus  is  particularly  difficult.  On  the  other 
hand, the practical scanning locus used in x-ray microtomography
can be made to meet our three conditions [11]-[13]. Without using
the derivative of the scanning locus, the proposed formula will not
be affected by the error in estimating the derivative.

A  Fast  Algorithm  for  Backprojection
with  Linear  Interpolation

Berkman  Sahiner  and  Andrew  E.  Yagle 


Abstract: In the filtered backprojection procedure for image reconstruction
from projections, backprojection dominates the computation
time. We propose a simple algorithm that reduces the number of multiplications
in linear interpolation and backprojection stage by 50%, with
a small increase in the number of additions. The algorithm performs
the interpolation and backprojection of four views together. Examples
of implementation are given and extension to interpolation of more than
four views is discussed.


I.  INTRODUCTION

The basic problem of x-ray tomography is to reconstruct an image
$\mu(x,y)$ from its projections $p(r,\theta)$ where

$p(r,\theta) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \mu(x,y)\,\delta(r - x\cos\theta - y\sin\theta)\,dx\,dy$  (1)

is the Radon transform of $\mu(x,y)$.

The most common procedure for image reconstruction from projections
is filtered backprojection (FBP), in which the reconstruction
is carried out in two stages. The first stage is filtering, in which the
projections are filtered by a modified ramp filter to yield the filtered
projections $q(r,\theta)$ [1]. The second stage is backprojection, in which
the desired image $\mu(x,y)$ is obtained from filtered projections by
backprojection.

In practical problems, we have only samples of $p(r,\theta)$. Let us
denote

$P_i(R) = p(r,\theta_i)\big|_{r=Rd,\ \theta_i=\pi(i-1)/N}$,  $R = 0,\ldots,M-1$,  $i = 1,\ldots,N$  (2)

where $N$ is the total number of views, $M$ is the number of samples
in each view, and $d$ is the radial sampling distance.

Similarly, let $Q_i(R)$ denote the discrete filtered projections, obtained
by filtering $P_i(R)$ using a modified ramp filter. Then the
backprojection operation is

$\mu(x,y) = \frac{\pi}{N}\sum_{i=1}^{N} q_i(x\cos\theta_i + y\sin\theta_i)$  (3)

where $q_i(r) = q(r,\theta_i)$ is obtained from $Q_i(R)$ by some kind of
interpolation.

In the FBP method, backprojection dominates the computation
time. The filtering stage can be done using the fast Fourier transform
(FFT), which requires $O(NM\log_2 M)$ operations. On the other hand,
in the backprojection stage, interpolation needs to be done for every
$x$, $y$, and $i$ in (3); therefore if linear interpolation (requiring 1
multiplication per interpolated point) is used, $NL^2$ multiplications
and $3NL^2$ additions are needed to compute an $L \times L$ image.

In this paper, we propose a modification in the linear interpolation-backprojection
stage which reduces the number of multiplications
by about 50%, with a slight increase in the number of additions.
This may result in a significant savings in computational time for
processors which perform multiplications slower than additions. It
will also result in reduced chip area in a VLSI implementation, since
multipliers consume more chip area than adders.

There are a number of fast backprojection techniques, e.g. [2], [3],
and [4], which reduce the number of operations even more than our
technique. In the next section, we first present the basic idea and then
compare our technique with other existing fast algorithms. In Section
III, we discuss some implementation issues and give examples. A
generalization of the basic idea is presented in Section IV.


II.  THE FAST ALGORITHM

If linear interpolation is used in (3), then $q_i(r)$ is approximated as

$\hat{q}_i(r) = Q_i(R) + Fr(r)\,\Delta Q_i(R)$  (4)

where $R = [r]$ is the largest integer not exceeding $r$, $Fr(r) = r - R$
is the fractional part of $r$, and the $\Delta$ operator acting on a discrete
function is defined as $\Delta f(k) = f(k+1) - f(k)$.

We now show that if we backproject four views, namely the views
numbered $i$, $i+N/4$, $i+N/2$, and $i+3N/4$, then multiplications
by $Fr(r)$ above can be combined.

For notational convenience, let us denote

$q_{i,j}(x,y) = q_{i+jN/4}\big(x\cos(\theta_i+\phi_j) + y\sin(\theta_i+\phi_j)\big)$,  $\phi_j = j\pi/4$,  $j = 0,1,2,3$  (5)

and rewrite (3) as

$\mu(x,y) = \frac{\pi}{N}\sum_{i=1}^{N/4}\sum_{j=0}^{3} q_{i,j}(x,y)$  (6)

Let us explicitly write $q_{i,0}(x,y)$ and $q_{i,2}(x,y)$ in terms of filtered
projections $Q_i(R)$ as

$q_{i,0}(x,y) = Fr(r_0)\,\Delta Q_i(R_0) + Q_i(R_0)$  (7)

where $R_0 = [r_0]$ and $r_0 = x\cos\theta_i + y\sin\theta_i$. Similarly,

$q_{i,2}(x,y) = Fr(r_2)\,\Delta Q_{i+N/2}(R_2) + Q_{i+N/2}(R_2)$  (8)

where $R_2 = [r_2]$ and $r_2 = -x\sin\theta_i + y\cos\theta_i$. We find that to compute $q_{i,0}(x,y)$
and $q_{i,2}(x,y)$, we need to perform two multiplications, one by
$Fr(r_0)$ and another by $Fr(r_2)$.

Let us now express $q_{i,1}(x,y)$ and $q_{i,3}(x,y)$ using the same factors.
Applying the sine and cosine addition formulas to (5) yields

$q_{i,1}(x,y) = q_{i+N/4}\big(\tfrac{\sqrt{2}}{2}(r_0 + r_2)\big)$,  $q_{i,3}(x,y) = q_{i+3N/4}\big(\tfrac{\sqrt{2}}{2}(-r_0 + r_2)\big)$  (9)

If we have samples of $q_{i+N/4}$ with a sampling interval of $\sqrt{2}/2$,
i.e., if we have the discrete signal

$Q'_{i+N/4}(R) = q_{i+N/4}\big(\tfrac{\sqrt{2}}{2}R\big)$,  $R = 0,\ldots,[M\sqrt{2}]$  (10)

then $q_{i,1}(x,y)$ can be expressed as

$q_{i,1}(x,y) = Fr(r_1)\,\Delta Q'_{i+N/4}(R_1) + Q'_{i+N/4}(R_1)$  (11)

where $r_1 = r_0 + r_2$ (measured in units of the $\sqrt{2}/2$ sampling interval) and $R_1 = [r_1]$. However, using the fact that

$Fr(r_1) = Fr(r_0 + r_2) = Fr\big(Fr(r_0) + Fr(r_2)\big)$  (12)

equation (11) can be written as

$q_{i,1}(x,y) = Fr(r_2)\,\Delta Q'_{i+N/4}(R_1) + Fr(r_0)\,\Delta Q'_{i+N/4}(R_1) + Q'_{i+N/4}(R_1) + s_1\,\Delta Q'_{i+N/4}(R_1)$  (13)

where

$s_1 = \begin{cases} 0 & \text{if } 0 \le Fr(r_0) + Fr(r_2) < 1 \\ -1 & \text{if } 1 \le Fr(r_0) + Fr(r_2) < 2. \end{cases}$

Similarly, $q_{i,3}(x,y)$ can be written as (compare to (13))

$q_{i,3}(x,y) = Fr(r_2)\,\Delta Q'_{i+3N/4}(R_3) - Fr(r_0)\,\Delta Q'_{i+3N/4}(R_3) + Q'_{i+3N/4}(R_3) + s_3\,\Delta Q'_{i+3N/4}(R_3)$  (14)

where $R_3 = [-r_0 + r_2]$ and

$s_3 = \begin{cases} 0 & \text{if } 0 \le -Fr(r_0) + Fr(r_2) < 1 \\ 1 & \text{if } -1 < -Fr(r_0) + Fr(r_2) < 0. \end{cases}$

Considering (7), (8), (13), and (14), we find that $\sum_{j=0}^{3} q_{i,j}(x,y)$
can be computed using 2 multiplications, 2 if statements and an
average of 16 additions, whereas the classical formula for linear
interpolation-backprojection requires 4 multiplications and 12 additions
(excluding integer additions for both cases).

The basic symmetry incorporated in the algorithm is shown in
Fig. 1, where we consider the contribution of views numbered
1, $N/4+1$ and $N/2+1$ to the reconstruction of a point P. Vertical,
horizontal and diagonal lines show, respectively, the points where the
backprojections of views 1, $N/2+1$ and $N/4+1$ are known without
need for interpolation. Notice that the spacing between vertical and
horizontal lines is 1, whereas the spacing between diagonal lines is
$\sqrt{2}/2$. If P is in the triangle ABC, then $Fr(r_0) + Fr(r_2) < 1$
and we see from the figure that $Fr(r_1) = Fr(r_0) + Fr(r_2)$.
Hence, $q_{i,1}(x,y)$ in (11) can be expressed using $Fr(r_0)$ and $Fr(r_2)$.
A similar argument applies when P is in the triangle BCD and
$1 \le Fr(r_0) + Fr(r_2) < 2$.
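The four-view combination in (7), (8), (13), and (14) can be checked numerically. The sketch below is illustrative only: the sampled projections are passed as hypothetical callables `Q0`, `Q1p`, `Q2`, `Q3p` (integer index in, sample value out, with `Q1p` and `Q3p` standing for the resampled signals $Q'$), and coordinates are assumed to be in units of the radial sampling distance.

```python
import math

def classical_four_views(r0, r2, Q0, Q1p, Q2, Q3p):
    """Four separate linear interpolations as in (4): four multiplications."""
    total = 0.0
    for r, Q in ((r0, Q0), (r0 + r2, Q1p), (r2, Q2), (-r0 + r2, Q3p)):
        R = math.floor(r)
        total += Q(R) + (r - R) * (Q(R + 1) - Q(R))
    return total

def fast_four_views(r0, r2, Q0, Q1p, Q2, Q3p):
    """Same sum using only two multiplications, by Fr(r0) and Fr(r2)."""
    R0, R2 = math.floor(r0), math.floor(r2)
    f0, f2 = r0 - R0, r2 - R2
    s1 = -1 if f0 + f2 >= 1.0 else 0      # carry term of (13)
    s3 = 1 if f2 - f0 < 0.0 else 0        # borrow term of (14)
    R1 = R0 + R2 - s1                     # equals floor(r0 + r2)
    R3 = -R0 + R2 - s3                    # equals floor(-r0 + r2)
    d0 = Q0(R0 + 1) - Q0(R0)
    d1 = Q1p(R1 + 1) - Q1p(R1)
    d2 = Q2(R2 + 1) - Q2(R2)
    d3 = Q3p(R3 + 1) - Q3p(R3)
    const = Q0(R0) + Q1p(R1) + Q2(R2) + Q3p(R3)
    if s1:
        const -= d1                       # s1 * d1 with s1 = -1, done as a subtraction
    if s3:
        const += d3                       # s3 * d3 with s3 = 1, done as an addition
    return f0 * (d0 + d1 - d3) + f2 * (d2 + d1 + d3) + const
```

The two functions should agree to rounding error for any point, since the fast version only regroups the terms that multiply $Fr(r_0)$ and $Fr(r_2)$.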

We now compare our algorithm to some of the existing fast
backprojection algorithms. In [3], an incremental algorithm is used to
reduce the number of multiplications to $O(NM)$ and the computation
run-time is improved by a factor of 1.86 (or 4.43 depending on the
processor) for a 127 x 127 image. However, for efficient implementation,
[3] requires a complex search flow algorithm whereas ours does
not. In [2], linograms are used to reduce the number of operations to
$O(NM\log M)$ and a speed-up factor of 2 to 3 is reported. However,
note that [2] requires the projections to be on a nonuniform grid in
the $(r,\theta)$ plane. Finally, [4] reports a speed-up factor of 1.2 for a
128 x 128 image; their speed-up factor varies proportionally to image
size and is less than 1 for smaller images.


III.  COMPUTATION OF Q'(R) AND EXAMPLES

In order to be able to implement (13) and (14), we need to
resort to some kind of interpolation to compute $Q'(R)$ from $Q(R)$
(in the sequel, for simplicity, we drop the subscript $i$). Many
interpolation methods for computing $Q'(R)$, such as bandlimited
[...] require far fewer operations
than $O(NL^2)$; hence the computational overhead will be small. In
this section, we discuss linear and bandlimited interpolation and give
[...]

Linear Interpolation

Linear interpolation of $Q(R)$ to obtain $Q'(R)$ is very simple
a[...] about [...] multiplications. The resulting algorithm
will not be completely equivalent to conventional FBP, due to
the distortions caused by linear interpolation. However, computed
images are indistinguishable. Fig. 2 shows the result of applying
the conventional FBP [6] to the Shepp-Logan phantom of [...]

[...] the chirp z-transform (CZT) [5]. For an interpolation function with few ripples
we use a technique described in detail in [4]. Specifically, we let

$\tilde{Q}(k) = \sum_{R=0}^{M-1}$ [...]  (15)

and compute

$Q'(R) = \sum_{k=-M}^{M} F(k)$ [...]  (16)

where $F(k)$ is a half-band filter, for example $F(k) =$ [...].
(16) can be computed using the CZT. The
additional computation required for this interpolation is one M-point
FFT to implement (15) and two $(\sqrt{2}+2)M$ FFTs to implement (16). The
overall additional cost is again much smaller than [...] a typical
image for which $L \approx M$. The result of our new algorithm using
bandlimited interpolation is shown in Fig. 4.

Fig. 2. Reconstructed image using classical FBP, windowed between 1.00
and 1.04.

Fig. 3. Reconstructed image using the new algorithm with linear
interpolation.

Fig. 4. Reconstructed image using the new algorithm with bandlimited
interpolation.


IV.  GENERALIZATION TO MORE THAN FOUR VIEWS

Our algorithm uses the fact that views $i$, $i+N/4$, $i+N/2$ and
$i+3N/4$ can be backprojected together. In this section, we generalize
our idea to backproject more than four views together.

Consider the backprojection of a view at an angle $\theta_l$, where
$\theta_l = \theta_i + \Delta\theta$. The contribution of this view to the reconstructed
image is

$c_l(x,y) = q_l(x\cos\theta_l + y\sin\theta_l) = q_l(r_0\cos\Delta\theta + r_2\sin\Delta\theta)$  (17)

Let $\tan\Delta\theta = A/B$ where $A$ and $B$ are small integers (e.g.,
$A, B \in \{-2, -1, 1, 2\}$), and let us write (17) as

$c_l(x,y) = q_l\Big(\frac{\cos\Delta\theta}{B}\,(B r_0 + A r_2)\Big)$  (18)

Repeating the same reasoning used in Section II, if we have
samples of $q_l$ with a sampling interval of $\big|\frac{\cos\Delta\theta}{B}\big|$, then we can
interpolate

$q_l\Big(\frac{\cos\Delta\theta}{B}\,(B r_0 + A r_2)\Big)$

using a multiplication by

$Fr(B r_0 + A r_2) = Fr\big(B\,Fr(r_0) + A\,Fr(r_2)\big) = B\,Fr(r_0) + A\,Fr(r_2) - C$  (19)

where $C = [B\,Fr(r_0) + A\,Fr(r_2)]$.

Since they are small integers, multiplications by $A$, $B$, and $C$ can
be carried out using a few additions and if statements, so that the only
multiplications in the computation of $c_l(x,y)$ will be by $Fr(r_0)$ and
$Fr(r_2)$. Hence we can combine the backprojection of the angle $\theta_l$
with that of $\theta_i$ and $\theta_i + \pi/2$, factoring and saving the multiplications
that would be required to backproject one view. Depending on the
ratio of the addition/multiplication time or area of the processor, this
may represent some savings in the computation time or chip area
required.

Unfortunately, in a "natural" angular sampling scheme, angular
increments are uniform, so that the only possible angles satisfying
$\tan\Delta\theta = A/B$ are $\Delta\theta = \pi/4$ or $3\pi/4$; combining
interpolation and backprojection of more than four views will be
applicable only if angular sampling is done using a non-uniform $\Delta\theta$
or if angular interpolation is performed.


V.  CONCLUSION

We have presented an algorithm which, by operating on 4 views
together, saves 50% of multiplications in the linear interpolation and
backprojection stage of the FBP method. Unlike the method suggested
in [4] which uses an approximation to the trigonometric functions,
this algorithm reduces the number of multiplications even for small
$M$ and $L$. The resulting image is virtually identical to that obtained
by classical FBP.

We have also generalized the algorithm to backproject more than 4
views together. This generalized version, however, may increase the
number of additions disproportionately, and requires nonuniformly
spaced angular sampling.


Abstract

In  the  filtered  backprojection  procedure  for  image  recon¬ 
struction  from  projections,  backprojection  dominates  the 
computation  time.  We  propose  a  simple  algorithm  which 
reduces  the  number  of  multiplications  in  linear  interpola¬ 
tion  and  backprojection  stage  by  50%,  with  a  small  in¬ 
crease  in  the  number  of  additions.  The  algorithm  per¬ 
forms  the  interpolation  and  backprojection  of  four  views 
together.  Examples  of  implementation  are  given  and  ex¬ 
tension  to  more  than  four  views  is  discussed. 


I.  Introduction 


The basic problem of x-ray tomography is to reconstruct
an image $\mu(x,y)$ from its projections $p(r,\theta)$ where

$p(r,\theta) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \mu(x,y)\,\delta(r - x\cos\theta - y\sin\theta)\,dx\,dy$  (1)

is the Radon transform of $\mu(x,y)$.

The most common procedure for image reconstruction
from projections is filtered backprojection (FBP), in which
the reconstruction is carried out in two stages. The first
stage is filtering, in which the projections are filtered by a
modified ramp filter to yield the filtered projections $q(r,\theta)$
[1]. The second stage is backprojection, in which the desired
image $\mu(x,y)$ is obtained from the filtered projections by
backprojection.

In practical problems, we have only samples of $p(r,\theta)$.
Let us denote

$$P_i(R) = p(r,\theta_i)\big|_{r=Rd}, \quad \theta_i = \pi i/N, \quad R = 0,\ldots,M-1, \qquad (2)$$

where $N$ is the total number of views, $M$ is the number of
samples in each view, and $d$ is the radial sampling distance.
Similarly, let $Q_i(R)$ denote the discrete filtered projections,
obtained by filtering $P_i(R)$ using a modified ramp
filter. Then the backprojection operation is

$$\mu(x,y) = \frac{\pi}{N}\sum_{i=1}^{N} q_i(x\cos\theta_i + y\sin\theta_i) \qquad (3)$$

where $q_i(r) = q(r,\theta_i)$ is obtained from $Q_i(R)$ by some kind
of interpolation.

In the FBP method, backprojection dominates the computation
time. The filtering stage can be done using the Fast Fourier
Transform (FFT), which requires $O(NM\log_2 M)$ operations.
On the other hand, in the backprojection stage, interpolation
needs to be done for every $x$, $y$, and $i$ in (3); therefore if
linear interpolation (requiring 1 multiplication per interpolated
point) is used, then $NL^2$ multiplications and on the order of
$NL^2$ additions are needed to compute an $L \times L$ image.

*This work was supported by the Office of Naval Research under
grant #N00014-90-J-1897.

In this paper, we propose a modification in the linear
interpolation-backprojection stage which reduces the number
of multiplications by about 50%, with a slight increase
in the number of additions. This may result in significant
savings in computational time for processors which
perform multiplications slower than additions. It will also
result in reduced chip area in a VLSI implementation, since
multipliers consume more chip area than adders.

In the next section, the basic idea is presented. In Section 3,
we discuss some implementation issues and give
examples. A generalization of the basic idea is presented
in Section 4.

II.  The  Fast  Algorithm 

If linear interpolation is used in (3), then $q_i(r)$ is approximated as:

$$q_i(r) = Q_i(R) + \mathrm{Fr}(r)\,\Delta Q_i(R), \qquad (4)$$

where $R = [r]$ is the largest integer not exceeding $r$, $\mathrm{Fr}(r) = r - R$
is the fractional part of $r$, and the $\Delta$ operator acting
on a discrete function is defined as $\Delta f(R) = f(R+1) - f(R)$.
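
As a minimal illustration of (4), the following Python sketch evaluates one linearly interpolated sample. The function name and the plain-list representation of $Q_i(R)$ are assumptions made for this example only; $r$ is taken to be expressed in units of the sampling distance $d$ and to lie inside the sampled range.

```python
import math

def linear_interp(Q, r):
    """Evaluate q(r) = Q[R] + Fr(r) * (Q[R+1] - Q[R]) with R = floor(r), as in (4)."""
    R = math.floor(r)            # integer part [r]
    frac = r - R                 # fractional part Fr(r), in [0, 1)
    delta = Q[R + 1] - Q[R]      # forward difference, Delta Q(R)
    return Q[R] + frac * delta   # the single multiplication per interpolated point

Q = [0.0, 2.0, 6.0, 4.0]         # hypothetical filtered-projection samples
value = linear_interp(Q, 1.25)   # 2.0 + 0.25 * (6.0 - 2.0)
```

The single product `frac * delta` is the multiplication that the algorithm of this section tries to share among several views.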

We now show that if we backproject four views, namely
the views numbered $i$, $i+N/4$, $i+N/2$, and $i+3N/4$, then
multiplications by $\mathrm{Fr}(r)$ above can be combined.

For notational convenience, let us denote

$$q_{i,j}(x,y) = q_{i+jN/4}\!\left(r = x\cos(\theta_i + j\tfrac{\pi}{4}) + y\sin(\theta_i + j\tfrac{\pi}{4})\right), \quad j = 0,1,2,3, \qquad (5)$$

and rewrite (3) as

$$\mu(x,y) = \frac{\pi}{N}\sum_{i=1}^{N/4}\sum_{j=0}^{3} q_{i,j}(x,y). \qquad (6)$$

Let us explicitly write $q_{i,0}(x,y)$ and $q_{i,2}(x,y)$ in terms of
the given filtered projections $Q_i(R)$ as

$$q_{i,0}(x,y) = \mathrm{Fr}(r_0)\,\Delta Q_i(R_0) + Q_i(R_0), \qquad (7)$$

where $R_0 = [r_0]$ and $r_0 = x\cos\theta_i + y\sin\theta_i$. Similarly,

$$q_{i,2}(x,y) = \mathrm{Fr}(r_2)\,\Delta Q_{i+N/2}(R_2) + Q_{i+N/2}(R_2), \qquad (8)$$

where $r_2 = -x\sin\theta_i + y\cos\theta_i$ and $R_2 = [r_2]$. We find that to compute
$q_{i,0}(x,y)$ and $q_{i,2}(x,y)$, we need to perform two multiplications,
one by $\mathrm{Fr}(r_0)$ and another by $\mathrm{Fr}(r_2)$.
Now express $q_{i,1}(x,y)$ and $q_{i,3}(x,y)$ in terms of the same
quantities $r_0$ and $r_2$. Applying the sine and cosine addition formulas
to (5), with $\cos(\pi/4) = \sin(\pi/4) = 1/\sqrt{2}$, we have

$$q_{i,1}(x,y) = q_{i+N/4}\!\left(r = \tfrac{1}{\sqrt{2}}(r_0 + r_2)\right). \qquad (9)$$

If we have samples of $q_{i+N/4}$ with a sampling interval of
$1/\sqrt{2}$, i.e., if we have the discrete signal

$$Q'_{i+N/4}(R) = q_{i+N/4}(r)\big|_{r=R/\sqrt{2}}, \quad R = 0,1,2,\ldots,[M\sqrt{2}], \qquad (10)$$

then $q_{i,1}(x,y)$ can be expressed as

$$q_{i,1}(x,y) = \mathrm{Fr}(r_1)\,\Delta Q'_{i+N/4}(R_1) + Q'_{i+N/4}(R_1), \qquad (11)$$

where $r_1 = r_0 + r_2$ and $R_1 = [r_1]$. However, using

$$\mathrm{Fr}(r_1) = \mathrm{Fr}(r_0 + r_2) = \mathrm{Fr}(\mathrm{Fr}(r_0) + \mathrm{Fr}(r_2)), \qquad (12)$$

equation (11) can be written as

$$q_{i,1}(x,y) = \mathrm{Fr}(r_0)\,\Delta Q'_{i+N/4}(R_1) + \mathrm{Fr}(r_2)\,\Delta Q'_{i+N/4}(R_1) + Q'_{i+N/4}(R_1) + \gamma_1\,\Delta Q'_{i+N/4}(R_1), \qquad (13)$$

where

$$\gamma_1 = \begin{cases} 0 & \text{if } 0 \le \mathrm{Fr}(r_0) + \mathrm{Fr}(r_2) < 1 \\ -1 & \text{if } 1 \le \mathrm{Fr}(r_0) + \mathrm{Fr}(r_2) < 2. \end{cases}$$

Similarly, $q_{i,3}(x,y)$ can be written as (compare to (13))

$$q_{i,3}(x,y) = \mathrm{Fr}(r_2)\,\Delta Q'_{i+3N/4}(R_3) - \mathrm{Fr}(r_0)\,\Delta Q'_{i+3N/4}(R_3) + Q'_{i+3N/4}(R_3) + \gamma_3\,\Delta Q'_{i+3N/4}(R_3), \qquad (14)$$

where $R_3 = [-r_0 + r_2]$ and

$$\gamma_3 = \begin{cases} 0 & \text{if } 0 \le -\mathrm{Fr}(r_0) + \mathrm{Fr}(r_2) < 1 \\ 1 & \text{if } -1 < -\mathrm{Fr}(r_0) + \mathrm{Fr}(r_2) < 0. \end{cases}$$

Considering (7), (8), (13), and (14), $\sum_{j=0}^{3} q_{i,j}(x,y)$ can
be computed using 2 multiplications, 2 if statements and
an average of 16 additions; the classical formula for linear
interpolation-backprojection requires 4 multiplications and
slightly fewer additions (excluding integer additions for both cases).
The two multiplications arise because all terms multiplied by
$\mathrm{Fr}(r_0)$ in (7), (13), and (14) can be summed before multiplying,
and likewise for the terms multiplied by $\mathrm{Fr}(r_2)$ in (8), (13), and (14).


III. Computation of Q'(R) and Examples


To be able to implement (13) and (14), we need to resort
to some kind of interpolation to compute $Q'(R)$ from
$Q(R)$. (In the sequel, for simplicity, we drop the view subscript.)
Many interpolation methods for computing $Q'(R)$, such
as bandlimited, spline, or Lagrange interpolation, will require
far fewer operations than $O(NL^2)$; hence the computational
overhead will be small. In this section, we discuss
linear and bandlimited interpolation, and give examples.

A. Linear Interpolation

Linear interpolation of $Q(R)$ to obtain $Q'(R)$ is very simple
and requires only about $\frac{N}{2}M\sqrt{2}$ multiplications. Figure 1
shows the result of applying the conventional FBP [3] to
the Shepp-Logan phantom of [3], and Figure 2 shows the
result of applying our new algorithm using linear interpolation.
The results are indistinguishable.

B. Bandlimited Interpolation

Bandlimited interpolation can be performed using the
Chirp z-transform (CZT) [2]. Specifically, if $\hat{Q}(k)$,
$k = 0, 1, \ldots, M-1$, is the discrete Fourier transform of $Q(R)$,
then we let

$$\tilde{Q}(k) = \begin{cases} \tfrac{1}{2}\hat{Q}(k) & k = M/2 \\ \hat{Q}(k) & k = 0,\ldots,M/2-1 \\ \hat{Q}(M+k) & k = -1,\ldots,-M/2+1 \\ \tfrac{1}{2}\hat{Q}(M+k) & k = -M/2 \end{cases} \qquad (15)$$

and compute

$$Q'(n) = \frac{1}{M}\sum_{k=-M/2}^{M/2} \tilde{Q}(k)\,e^{j2\pi kn/(M\sqrt{2})} \qquad (16)$$

for $n = 0,\ldots,[(M-1)\sqrt{2}]$, which can be done using the
CZT [2]. This corresponds to the interpolation

$$Q'(n) = \sum_{R=0}^{M-1} Q(R)\,h(n/\sqrt{2} - R), \qquad (17)$$

where $h(x)$ is the periodic bandlimited interpolation function.

Using the CZT, this procedure requires the computation
of a single $M' = [M(\sqrt{2}+1)+1]$-point inverse FFT for each
view for the convolution. ($\hat{Q}(k)$ is already available from
the filtering stage.) This requires roughly $\frac{N}{2}M'\log_2 M'$
multiplications, which is usually much smaller than $NL^2$
for a typical image for which $L \approx M$.

Although bandlimited interpolation is attractive and simple
to implement, it has a major drawback: the interpolation
function $h(x)$ has too many ripples. These ripples can
be reduced using a smoothing function in the frequency
domain, but at the expense of spatial resolution. Figure 3
shows the result of applying our new algorithm with
bandlimited interpolation. Notice that the ripples caused by
the high-density skull area go deep inside the picture.

From this, we conclude that linear interpolation is faster,
and seems to give better results.


IV. A Generalization


Our algorithm uses the fact that views $i$, $i+N/4$, $i+N/2$
and $i+3N/4$ can be backprojected together. In this section,
we generalize our idea to backproject more than four
views together.

Consider the backprojection of a view at an angle $\theta_l$,
where $\theta_l = \theta_i + \Delta\theta$. The contribution of this view to the
reconstructed image is

$$c_l(x,y) = \frac{\pi}{N}\,q_l(x\cos\theta_l + y\sin\theta_l) = \frac{\pi}{N}\,q_l(r_0\cos\Delta\theta + r_2\sin\Delta\theta). \qquad (18)$$

Let $\tan\Delta\theta = A/B$, where $A$ and $B$ are small integers (e.g.,
$A, B \in \{-2,-1,1,2\}$), and let us write (18) as

$$c_l(x,y) = \frac{\pi}{N}\,q_l\!\left(\frac{\cos\Delta\theta}{B}(Br_0 + Ar_2)\right). \qquad (19)$$
Repeating the reasoning used in Section 2, if we have samples
of $q_l$ with a sampling interval of $|\cos\Delta\theta / B|$, then we can
interpolate $q_l\!\left(\frac{\cos\Delta\theta}{B}(Br_0 + Ar_2)\right)$ using a multiplication by

$$\mathrm{Fr}(Br_0 + Ar_2) = \mathrm{Fr}(B\,\mathrm{Fr}(r_0) + A\,\mathrm{Fr}(r_2)) = B\,\mathrm{Fr}(r_0) + A\,\mathrm{Fr}(r_2) - C, \qquad (20)$$

where $C = [B\,\mathrm{Fr}(r_0) + A\,\mathrm{Fr}(r_2)]$.

Since $A$, $B$, and $C$ are small integers, multiplications by
these integers can be carried out using a few additions and
if statements, so that the only multiplications in the computation
of $c_l(x,y)$ will be by $\mathrm{Fr}(r_0)$ and $\mathrm{Fr}(r_2)$. Hence
we can combine the backprojection of the angle $\theta_l$ with
that of $\theta_i$ and $\theta_i + \pi/2$, factoring and saving the multiplications
that would be required to backproject one view. Depending
on the ratio of the addition/multiplication time
or area of the processor, this may represent some savings
in the computation time or chip area required.

Unfortunately, in a "natural" angular sampling scheme,
angular increments are uniform, so that the only possible
angles satisfying $\tan\Delta\theta = A/B$ are $\Delta\theta = \pi/4$ or $\Delta\theta = 3\pi/4$. The
idea of combining interpolation and backprojection of more
than four views will be applicable only if angular sampling
is done using a non-uniform $\Delta\theta$ or if angular interpolation
is performed.
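
The identity (20) can be checked numerically with a short Python sketch. The helper names below are hypothetical and the sample values are arbitrary; the point is only that the fractional part of $Br_0 + Ar_2$ is recoverable from $\mathrm{Fr}(r_0)$ and $\mathrm{Fr}(r_2)$ when $A$ and $B$ are integers.

```python
import math

def frac(r):
    """Fractional part Fr(r) = r - floor(r)."""
    return r - math.floor(r)

def combined_frac(fr0, fr2, A, B):
    """Fr(B*r0 + A*r2) computed from Fr(r0) and Fr(r2) as in (20); A, B integers."""
    s = B * fr0 + A * fr2        # for small integers this needs only additions
    C = math.floor(s)            # the small integer C of (20)
    return s - C

r0, r2 = 3.7, -1.2               # arbitrary sample values
direct = frac(2 * r0 + 1 * r2)   # Fr(B*r0 + A*r2) with A = 1, B = 2
shared = combined_frac(frac(r0), frac(r2), 1, 2)
```

Here `direct` and `shared` agree up to floating-point rounding, which is what allows the products with $\mathrm{Fr}(r_0)$ and $\mathrm{Fr}(r_2)$ to be shared between views.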

V.  Conclusion 

We have presented an algorithm which, by operating on 4
views together, saves 50% of multiplications in the linear
interpolation and backprojection stage of the FBP method.
Unlike the method suggested in [4], which uses an approximation
to the trigonometric functions, this algorithm reduces
the number of multiplications even for small $M$ and
$L$. The resulting image is virtually identical to that obtained
by classical FBP.
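
The four-view combination can be illustrated for a single pixel with a minimal Python sketch. It assumes that the resampled sequences $Q'$ of (10) are already available as lists, that $r_2 \ge r_0 \ge 0$ so all indices stay inside the arrays, and it omits the constant factor in (6). All function names and sample values are illustrative and are not taken from the paper.

```python
import math

def lin(Q, r):
    """Classical linear interpolation (4): one multiplication per view."""
    R = math.floor(r)
    return Q[R] + (r - R) * (Q[R + 1] - Q[R])

def classical_four_views(Q0, Q1p, Q2, Q3p, r0, r2):
    """Reference: interpolate the four views separately (4 multiplications)."""
    return lin(Q0, r0) + lin(Q1p, r0 + r2) + lin(Q2, r2) + lin(Q3p, r2 - r0)

def combined_four_views(Q0, Q1p, Q2, Q3p, r0, r2):
    """Same sum with two multiplications, following (7), (8), (13), (14)."""
    R0, R2 = math.floor(r0), math.floor(r2)
    f0, f2 = r0 - R0, r2 - R2                  # Fr(r0), Fr(r2)
    g1 = -1 if f0 + f2 >= 1.0 else 0           # gamma_1 (first if statement)
    g3 = 1 if f2 - f0 < 0.0 else 0             # gamma_3 (second if statement)
    R1 = R0 + R2 - g1                          # [r0 + r2], integer arithmetic only
    R3 = R2 - R0 - g3                          # [-r0 + r2]
    d0 = Q0[R0 + 1] - Q0[R0]
    d1 = Q1p[R1 + 1] - Q1p[R1]
    d2 = Q2[R2 + 1] - Q2[R2]
    d3 = Q3p[R3 + 1] - Q3p[R3]
    total = f0 * (d0 + d1 - d3) + f2 * (d2 + d1 + d3)   # the two multiplications
    total += Q0[R0] + Q1p[R1] + Q2[R2] + Q3p[R3]
    if g1:
        total -= d1                            # gamma_1 = -1
    if g3:
        total += d3                            # gamma_3 = +1
    return total

# Hypothetical sample data for views i, i+N/4 (resampled), i+N/2, i+3N/4 (resampled).
Q0 = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]
Q1p = [2.0, 0.0, 1.0, 5.0, 3.0, 7.0, 4.0]
Q2 = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
Q3p = [5.0, 4.0, 2.5, 2.0, 1.0, 0.0]
```

For any admissible pair $(r_0, r_2)$, the two functions return the same value up to rounding, since the combined version only regroups the terms of the classical sum.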

We have also generalized the algorithm to backproject
more than 4 views together. This generalized version, however,
may increase the number of additions disproportionately,
and requires non-uniformly spaced angular sampling.
Figure 1: Reconstructed image using classical FBP, windowed
between 1.00 and 1.04.


Figure 2: Reconstructed image using the new algorithm with
linear interpolation, windowed between 1.00 and 1.04.


Figure 3: Reconstructed image using the new algorithm with
bandlimited interpolation, windowed between 1.00 and 1.06.
APPENDIX M

B.  Sahiner  and  A.E.  Yagle,  “Limited  Angle  Tomography  Using  the  Wavelet 
Transform,”  revision  submitted  to  IEEE  Trans.  Image  Proc.,  October  1993. 

This paper shows that when the extent of missing angles is small in limited-angle
tomography, two of the three sets of detail images in the wavelet transform are unaffected,
and low-resolution images can be obtained by interpolation. Using some a priori partial
information on edges parallel to the missing angles, we have developed a wavelet-domain
algorithm for restoring the image.


LIMITED ANGLE TOMOGRAPHY USING THE WAVELET TRANSFORM

Berkman Sahiner and Andrew E. Yagle
Dept. of Electrical Engineering and Computer Science
The University of Michigan, Ann Arbor, Michigan 48109

Abstract 

We investigate the problem of limited angle tomography when there exists a priori knowledge about
edges that lie parallel to missing view angles. We fill the missing high-frequency regions in the Fourier
plane using the edge knowledge, and the low-frequency regions using a simple interpolation procedure.
We characterize the edges, and bring together the lowpass and highpass images, by using the wavelet
transform.


I.  INTRODUCTION 

The limited angle problem in tomography has received a lot of attention in recent years due to its
applications in medical imaging, astronomy, electron microscopy, industry, etc. [1, 2]. One of the most
important tools in limited angle reconstruction is a priori information about the image, which may be in
the form of limited spatial extent of the image, upper and lower bounds on reconstructed pixel values,
energy constraints, or closeness to a known image [3, 4]. In this correspondence, we investigate the problem
of image restoration when the a priori knowledge is the existence of edges that lie parallel to the missing
view angles.

To define the limited angle problem, we first review the use of complete data in tomography. The
data consist of line integrals $p_f(r_m,\theta_n)$ of an image $f(x,y)$, which are defined as

$$p_f(r_m,\theta_n) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x,y)\,\delta(r_m - x\cos\theta_n - y\sin\theta_n)\,dx\,dy, \qquad (1)$$

where $\theta_n = (n+0.5)\pi/N$, $n = 0,1,\ldots,N-1$ and $r_m = \Delta r\,(m - M/2)$, $m = 0,1,\ldots,M-1$. It is assumed
that $\Delta r$, $M$ and $N$ have been chosen appropriately to obtain an accurate reconstruction $f_r(i_1,i_2)$ using
filtered backprojection. The filtered backprojection equation is

$$f_r(i_1,i_2) = \frac{\pi}{N}\sum_{n=0}^{N-1} q_{\theta_n}(i_1\cos\theta_n + i_2\sin\theta_n), \qquad (2)$$

EDK'S  .N'umber:  6..'i.l. 


Phone:41.3-76:?-98U) 


where $q_{\theta_n}(r)$ is obtained by convolving $p_f(r_m,\theta_n)$ with a ramp or modified ramp function [5].

We now define the limited angle problem. We assume, without loss of generality, that $2L$ views
centered around $\pi/2$ are missing in the projection data, corresponding to a missing angle extent of
$\theta_{miss} = 2L\pi/N$. The available data are thus $p_f(r_m,\theta_n)$, $n = 0,\ldots,\frac{N}{2}-L-1,\ \frac{N}{2}+L,\ldots,N-1$. The
projection-slice theorem states that the Fourier transform $P_f(w,\theta_n)$ of $p_f(r_m,\theta_n)$ is equal to the Fourier
transform $F(w_1,w_2)$ of the image $f(x,y)$ along a slice in the Fourier plane that passes through the
origin and makes an angle $\theta_n$ with the $w_1$ axis. Thus, the Fourier transform of the image is known on the
concentric circles grid shown in Figure 1, except for the bowtie-like region R. The quality of reconstruction
from missing views depends on how well we can fill in the missing Fourier transform samples in R. It has
recently been shown [6] that the squashing algorithm [7] is equivalent to setting frequency samples in R
to zero. Oskoui and Stark have used an interpolation technique to fill R [8]; however, results indicate that
their interpolation does not improve over squashing. Still another method is projections onto convex sets
(POCS), where the convex sets are defined by the given projections and the a priori knowledge about
the image [9, 3]. These methods have been compared in [8].

In this correspondence, we propose to fill the high-frequency regions of R using knowledge of edges
which lie parallel to the x axis, and to fill the low-frequency regions of R using a simple interpolation
technique. We characterize the edges, and bring together the lowpass and highpass regions of R, by using
the wavelet transform.

This correspondence makes four contributions: (1) We give a new interpretation of why edges in certain
directions cause artifacts in their vicinity and are blurred, while other edges are not. (2) We give a new
interpolation procedure to obtain a low-resolution image; this isolates directional high-resolution images
as missing information. (3) With the use of a priori edge information to complete the reconstruction,
we not only sharpen the edges but also eliminate the artifacts caused by these edges. (4) We present a
number of numerical examples with two different phantoms, different wavelet bases, and different numbers
of scales in the wavelet representation.

In Section II, we review the wavelet transform and show that if $\theta_{miss}$ is small enough, then two of
the three sets of detail images in the wavelet transform will be largely unaffected. This provides a new
explanation of why edges orthogonal to the missing view angles are reconstructed well, while edges parallel
to the missing angles are blurred. We use the edge information in the remaining direction to partly fill
in the affected detail image. A recent publication [10] has shown that it is possible to reconstruct an
image from the values and location of the maxima of the wavelet transform, which roughly correspond
to the location of the edges. In our case, if the edges parallel to the x-axis are known, and the detail
images in diagonal and y-axis directions are unaffected, it is not surprising that the original image can be
recovered. In Section III, we discuss the interpolation technique that we use to fill in the lowpass region
of R. In Section IV, we present our algorithm and give numerical examples.


II. THE WAVELET TRANSFORM


In this correspondence, we restrict attention to discrete dyadic wavelet transforms. For a 1-D signal $f(i)$,
we use the following definition for the wavelet transform, which is described in more detail in [10]:

$$S_{2^0} f(i) = f(i),$$
$$W_{2^{j+1}} f(i) = S_{2^j} f(i) * G_j(i), \qquad S_{2^{j+1}} f(i) = S_{2^j} f(i) * H_j(i), \qquad 0 \le j < J. \qquad (3)$$

At each scale $j$, the algorithm decomposes $S_{2^j} f$ into a detail signal $W_{2^{j+1}} f$ and an average signal $S_{2^{j+1}} f$.
The filters $H_j(i)$ and $G_j(i)$ are obtained by inserting $2^j - 1$ zeros between the coefficients of $H_0(i)$ and
$G_0(i)$, which means that their Fourier transforms satisfy $H_j(w) = H_0(2^j w)$ and $G_j(w) = G_0(2^j w)$. $H_0(w)$
is a lowpass filter with $H_0(0) = 1$ and $G_0(w)$ is a highpass filter with $G_0(0) = 0$. Note that unlike the
definitions given in [11] or [12], this definition of the wavelet transform is redundant. This redundancy is
useful when the purpose of the wavelet transform is characterization or representation of sharp transitions
in the signal.
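
The recursion in (3) can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions: the function names are hypothetical, boundaries are handled by the zero padding implicit in `np.convolve(..., mode="same")`, and the two-tap filter pair used in the example is a simple Haar-like choice satisfying $H_0(0) = 1$ and $G_0(0) = 0$, not necessarily the normalization used in the paper.

```python
import numpy as np

def upsample_filter(h0, j):
    """Insert 2**j - 1 zeros between the coefficients of h0."""
    step = 2 ** j
    h = np.zeros((len(h0) - 1) * step + 1)
    h[::step] = h0
    return h

def dyadic_wavelet(f, h0, g0, J):
    """Return ([W_{2^1} f, ..., W_{2^J} f], S_{2^J} f); no subsampling, so every
    output has the same length as f (the transform is redundant)."""
    s = np.asarray(f, dtype=float)
    details = []
    for j in range(J):
        hj = upsample_filter(h0, j)
        gj = upsample_filter(g0, j)
        details.append(np.convolve(s, gj, mode="same"))  # detail signal at scale j+1
        s = np.convolve(s, hj, mode="same")              # average signal at scale j+1
    return details, s

h0 = [0.5, 0.5]     # lowpass, coefficients sum to 1
g0 = [0.5, -0.5]    # highpass, coefficients sum to 0
details, average = dyadic_wavelet(np.ones(16), h0, g0, J=3)
```

For a constant input, the detail signals vanish away from the boundaries and the average signal stays at the constant value, as expected from $G_0(0) = 0$ and $H_0(0) = 1$.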

If $K(w)$ is a filter that satisfies $K(w)G(w) + |H(w)|^2 = 1$, then a perfect reconstruction from the
wavelet transform is given by

$$S_{2^{j-1}} f(i) = W_{2^j} f(i) * K_{j-1}(i) + S_{2^j} f(i) * \bar{H}_{j-1}(i), \qquad (4)$$

where $\bar{H}_{j-1}(w) = \overline{H(2^{j-1}w)}$ and $K_{j-1}(w) = K(2^{j-1}w)$.

In this paper, we choose to use $G(i) = (-1)^i H(1-i)$ and $K(i) = G(-i)$, which implies that

$$|H(w)|^2 + |H(w+\pi)|^2 = 1. \qquad (5)$$

Filters that satisfy (5) are called quadrature mirror filters; orthogonal wavelets are a special class of
quadrature mirror filters.
For a 2-D signal $f(i_1,i_2)$, we define the wavelet transform as

$$W^1_{2^{j+1}} f(i_1,i_2) = S_{2^j} f(i_1,i_2) * G_j(i_1) * H_j(i_2), \quad 0 \le j < J,$$
$$W^2_{2^{j+1}} f(i_1,i_2) = S_{2^j} f(i_1,i_2) * H_j(i_1) * G_j(i_2), \quad 0 \le j < J,$$
$$W^3_{2^{j+1}} f(i_1,i_2) = S_{2^j} f(i_1,i_2) * G_j(i_1) * G_j(i_2), \quad 0 \le j < J,$$
$$S_{2^{j+1}} f(i_1,i_2) = S_{2^j} f(i_1,i_2) * H_j(i_1) * H_j(i_2), \quad 0 \le j < J. \qquad (6)$$

If $H$ satisfies (5), and with the choice of $G(i) = (-1)^i H(1-i)$, the reconstruction algorithm is

$$S_{2^{j-1}} f(i_1,i_2) = W^1_{2^j} f(i_1,i_2) * \bar{G}_{j-1}(i_1) * \bar{H}_{j-1}(i_2) + W^2_{2^j} f(i_1,i_2) * \bar{H}_{j-1}(i_1) * \bar{G}_{j-1}(i_2)$$
$$\qquad + W^3_{2^j} f(i_1,i_2) * \bar{G}_{j-1}(i_1) * \bar{G}_{j-1}(i_2) + S_{2^j} f(i_1,i_2) * \bar{H}_{j-1}(i_1) * \bar{H}_{j-1}(i_2), \qquad (7)$$

where $\bar{H}_j(i) = H_j(-i)$ and $\bar{G}_j(i) = G_j(-i)$.

The ideal partitioning of the Fourier plane with the filters $H$ and $G$ is shown in Figure 2. If $H$ and $G$
are ideal lowpass and highpass filters, and $\theta_{miss} < 2\tan^{-1}(1/3) \approx 36.8^\circ$, then we observe from Figure 2 that
$W^1_{2^j} f(i_1,i_2)$ and $W^3_{2^j} f(i_1,i_2)$ will not be affected. This is a new explanation of why edges not parallel
to the missing views are relatively unaffected by them.
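
As a quick numerical check of the bound quoted above, the following Python sketch evaluates $2\tan^{-1}(1/3)$ in degrees and compares it with the missing angle extents used later in the examples. It assumes that $2L$ of $N$ equally spaced views over $180^\circ$ are missing, so that $\theta_{miss} = 2L \cdot 180^\circ / N$, which reproduces the $22.5^\circ$ and $45^\circ$ values quoted in the figure headings; the function name is hypothetical.

```python
import math

def missing_angle_deg(L, N):
    """Missing angular extent for 2L missing views out of N, in degrees."""
    return 2 * L * 180.0 / N

threshold_deg = math.degrees(2 * math.atan(1 / 3))   # about 36.87 degrees

geometric = missing_angle_deg(8, 128)      # 16 missing views
shepp_logan = missing_angle_deg(16, 128)   # 32 missing views
```

With these numbers the 16-view case lies below the bound, while the 32-view case lies above it.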

As long as $\theta_{miss}$ is not larger than $36.8^\circ$, $W^2_{2^j} f(i_1,i_2)$ will be affected, but will still carry information
about the image. In fact, the missing information lost along with the missing angles shows up in the
effect on $W^2_{2^j} f(i_1,i_2)$. Specification of the missing data in Fourier space, which is global, can thus be
replaced with specification of the missing data in the local, directional wavelet coefficients $W^2_{2^j} f(i_1,i_2)$.
Edges in the image $f(i_1,i_2)$ correspond to high frequency components in the Fourier domain. However,
they can also be viewed as localized high-resolution components in the wavelet domain. In a limited angle
tomography problem, if the locations and magnitudes of some of the edges that lie parallel to the x-axis
in the image are known, then $W^2_{2^j} f(i_1,i_2)$ can be approximated up to some scale $J$ for values of $i_1$ and
$i_2$ near the known edges. However, since the edges give us little information about the low-resolution
behavior of the image, a different method must be used to fill this low-resolution information, which
corresponds to lowpass regions in the Fourier plane. A simple algorithm for this is given in Section III.

If $\theta_{miss} < 36.8^\circ$ or if the filter $H$ is not ideal, all three detail images in the wavelet transform of the
image obtained from the missing data will contain useful information; therefore we need not throw away
one detail image completely and replace it with the approximation obtained by the edge information.
This problem is solved by the use of forward projection in Section IV.
III. INTERPOLATION OF LOWPASS MISSING DATA


In the Fourier plane, the projections provide us with Fourier values on a concentric circles grid, which does
not have uniform sampling density. The sampling density is highest around the origin, and decreases
linearly as the frequency magnitude increases. This is one reason why a ramp filter is needed in the
convolution-backprojection method: high-frequency data are sampled more sparsely, so each datum
must be weighted more heavily. On the other hand, if the image $f(x,y)$ is spatially limited, as is the case
for many tomography problems, its Fourier transform $F(w_1,w_2)$ can be interpolated using its samples
on a rectangular grid which has uniform density everywhere. Although the ideal interpolation function
here is a sinc, which has infinite support, in most cases a few neighboring samples are sufficient for
interpolation. This difference in sampling densities suggests that there may be a high correlation among
the low frequency samples on a concentric circles grid.

As an illustrative example, assume that $f(x,y)$ is spatially limited to a circle of radius $T$ and assume
that the radial sampling interval for the concentric circles grid in the Fourier plane is the Nyquist interval
$\pi/T$. Since $f(x,y)$ is spatially limited, it is not frequency-limited; therefore, strictly speaking, we need
an infinite number of samples per radial line for exact representation. However, usually a finite number
of samples $M$ is sufficient. Also, assume that the number of angular samples $N$ is equal to $M$, giving us
the grid shown in Figure 3.

According to the sampling theorem, $F(w_1,w_2)$ can be interpolated in terms of its samples $F(\frac{\pi k}{T}, \frac{\pi l}{T})$.
Assume that we use the nearest neighbors in a $\frac{A\pi}{T} \times \frac{A\pi}{T}$ square centered around $(w_1,w_2)$ to interpolate
$F(w_1,w_2)$, where $A$ is an integer. Then, to interpolate the Fourier transform in a square $\frac{K\pi}{T} \times \frac{K\pi}{T}$,
we will need $(K+A)^2$ samples. However, in the circle of radius $\frac{K\pi}{2T}$ inscribed in the $\frac{K\pi}{T} \times \frac{K\pi}{T}$ square,
note that the concentric circles grid has $NK$ samples. This means that for every sample point necessary
for interpolation in the rectangular grid, there are $\frac{NK}{(K+A)^2}$ samples in the concentric circles grid. If
$N = M = K \gg A$ then this ratio is 1; however, if $K = M/16 \gg A$, then this ratio is 16. This means
that the sampling is redundant: some of these samples can be interpolated from others.
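
The redundancy ratio discussed above is easy to evaluate numerically. The following Python sketch is only an illustration; the function name and the sample values of $N$, $K$, and $A$ are assumptions chosen to reproduce the two limiting cases mentioned in the text.

```python
def sample_ratio(N, K, A):
    """Concentric-grid samples per rectangular-grid sample: N*K / (K + A)**2."""
    return N * K / (K + A) ** 2

full_band = sample_ratio(128, 128, 0)   # N = M = K, A negligible
low_band = sample_ratio(128, 8, 0)      # K = M/16, A negligible
with_margin = sample_ratio(128, 8, 2)   # same band, A = 2 neighbors included
```

The ratio grows as the square shrinks toward the origin, which is where the concentric circles grid is densest.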

Although a number of interpolation methods could be used, we chose the following simple linear
interpolation procedure:

$$P_f(w_m,\theta_n) = \frac{1}{2L+1}\left[\left(\frac{N}{2} + L - n\right) P_f(w_m,\theta_{N/2-L-1}) + \left(n - \frac{N}{2} + L + 1\right) P_f(w_m,\theta_{N/2+L})\right], \qquad (8)$$

where the Fourier transform of any missing view in the range of missing views is replaced by a linear
combination of the Fourier transforms of the two views which border the missing view range. Note that since
the interpolation is linear, missing projection data can be interpolated simply by taking the inverse
Fourier transform of (8), i.e., by replacing $P_f$ in (8) by $p_f$.
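
Because (8) is linear, it can be applied directly to the projection data. The following Python sketch with NumPy illustrates this under stated assumptions: the sinogram is stored as an array with one row per view, $N$ is even, the missing rows are $N/2-L, \ldots, N/2+L-1$, and the function name is hypothetical.

```python
import numpy as np

def fill_missing_views(p, L):
    """Replace views N/2-L .. N/2+L-1 by the linear combination (8) of the two
    known views that border the missing range."""
    N = p.shape[0]
    lo, hi = N // 2 - L - 1, N // 2 + L        # bordering known views
    out = p.astype(float)                       # astype returns a copy
    for n in range(lo + 1, hi):
        w_lo = (N // 2 + L - n) / (2 * L + 1)   # weight of view N/2-L-1
        w_hi = (n - N // 2 + L + 1) / (2 * L + 1)  # weight of view N/2+L
        out[n] = w_lo * p[lo] + w_hi * p[hi]
    return out

p = np.zeros((8, 3))
p[5] = 3.0                                      # view N/2+L for N = 8, L = 1
p[3:5] = np.nan                                 # the two missing views
filled = fill_missing_views(p, L=1)
```

The two weights always sum to one, so a quantity that is constant across the two bordering views is reproduced exactly in the filled views.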

To test the validity of this procedure, we tried it on two test images. The first image consists of
three geometric shapes. The second image is the Shepp-Logan phantom, in which the gray levels are
chosen as in [13]; this is a frequently-used phantom in limited angle studies [6, 8]. The reconstructed
images from full views with $N = M = 128$ are shown in Figures 4 and 5. Table 1 compares
the images obtained using the interpolation formula (8) to the images obtained by setting the unknown
Fourier coefficients to zero, as in the squashing algorithm [6]. The number of missing views $2L$ is 16 for
the geometric phantom and 32 for the Shepp-Logan phantom. The basis for comparison is the percent
root mean square error,

$$E_{rms}\,\hat{f} = 100\% \times \frac{\|f_r - \hat{f}\|_2}{\|f_r\|_2}.$$

The first row in Table 1 shows the error for $f_{lim}$, obtained by setting the unknown Fourier coefficients
to zero, and the second row shows the error for $f_{pi}$, obtained by using the interpolation formula (8).
The third and fourth rows show the error for $S_{2^4} f_{lim}$ and $S_{2^4} f_{pi}$ computed using the Haar basis. It is
observed that although the interpolation does not result in a dramatic improvement in the image itself,
the improvement in the lowpass signal $S_{2^4} f$ is dramatic.
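
The percent root mean square error used for these comparisons can be computed as in the following Python sketch. The function name is hypothetical, and the norm is taken over all pixels (the Frobenius norm for a 2-D array), which is an assumption about how $\|\cdot\|_2$ is meant for images.

```python
import numpy as np

def percent_rms_error(f_ref, f_est):
    """100 * ||f_ref - f_est||_2 / ||f_ref||_2, with norms taken over all pixels."""
    f_ref = np.asarray(f_ref, dtype=float)
    f_est = np.asarray(f_est, dtype=float)
    return 100.0 * np.linalg.norm(f_ref - f_est) / np.linalg.norm(f_ref)

err = percent_rms_error([[3.0, 4.0]], [[3.0, 1.0]])   # difference norm 3, reference norm 5
```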
IV. THE ALGORITHM AND NUMERICAL EXAMPLES

We summarize below our algorithm for limited angle tomographic reconstruction, given the locations
and magnitudes of edges that lie parallel to missing view angles:

1. Interpolate unknown projections from the known ones using (8). As discussed in Section III, this
interpolation will work well at low frequencies but not so well at high frequencies.

2. Reconstruct an image $f_{pi}(i_1,i_2)$ from the known and interpolated views using filtered backprojection.

3. Find the wavelet transform of $f_{pi}(i_1,i_2)$ up to some scale $J$ using (6).

4. From the knowledge of edges that lie parallel to the $i_1$ axis, construct an image $E(i_1,i_2)$ that has edges of
known magnitude at known locations, and no edges elsewhere. Compute $W^2_{2^j} E(i_1,i_2)$, $j = 1,\ldots,J$.

5. Reconstruct an image $f_{epi}(i_1,i_2)$ using the inverse wavelet transform from the images
$\{W^1_{2^j} f_{pi}(i_1,i_2),\ W^2_{2^j} E(i_1,i_2),\ W^3_{2^j} f_{pi}(i_1,i_2),\ j = 1,\ldots,J\}$ and $S_{2^J} f_{pi}(i_1,i_2)$.

6. Compute projections $p_{epi}(r_m,\theta_n)$ of $f_{epi}(i_1,i_2)$ along the missing views $\theta_n$, $n = \frac{N}{2}-L,\ldots,\frac{N}{2}+L-1$,
to complete the missing projection data. The completed set of views $\tilde{p}_f(r_m,\theta_n)$ is

$$\tilde{p}_f(r_m,\theta_n) = \begin{cases} p_{epi}(r_m,\theta_n) & \text{if } \frac{N}{2}-L \le n \le \frac{N}{2}+L-1 \\ p_f(r_m,\theta_n) & \text{otherwise.} \end{cases}$$

Finally, use the filtered backprojection algorithm on $\tilde{p}_f(r_m,\theta_n)$ to find the restored image $\hat{f}(i_1,i_2)$.
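
The data-completion rule in step 6 can be sketched in Python with NumPy as follows. This covers only the merging of measured and recomputed views; the forward projection of $f_{epi}$ and the final filtered backprojection are outside the sketch. The function name and the row-per-view array layout are assumptions made for this illustration.

```python
import numpy as np

def complete_views(p_measured, p_reprojected, L):
    """Use the reprojected views for n = N/2-L .. N/2+L-1 and the measured views
    everywhere else."""
    N = p_measured.shape[0]
    out = p_measured.copy()
    out[N // 2 - L : N // 2 + L] = p_reprojected[N // 2 - L : N // 2 + L]
    return out

measured = np.zeros((8, 2))       # hypothetical measured sinogram, N = 8 views
reprojected = np.ones((8, 2))     # hypothetical projections of f_epi
completed = complete_views(measured, reprojected, L=1)
```

Only the rows inside the missing range are taken from the reprojected data, so the measured views are passed through unchanged.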

We now present numerical examples involving the geometric phantom of Figure 4 and the Shepp-Logan
phantom of Figure 5. In all examples, complete data consist of 128 x 128 projections and the
missing data are centered around $\pi/2$.

Figure 6 shows $f_{lim}(i_1,i_2)$, obtained using filtered backprojection when 16 views ($L = 8$, $\theta_{miss} = 22.5^\circ$)
are missing from the projections of the geometric phantom. The missing views are replaced by zero in
filtered backprojection, as in the squashing algorithm [6]. We observe that there is blurring and artifacts
around the edges which lie parallel to the $i_1$ axis. Figure 7 shows the locations where we assume a priori
edge knowledge. Figure 8 shows the edge image $E(i_1,i_2)$ obtained from these locations and the edge
magnitudes. Figure 9 shows $f_{epi}(i_1,i_2)$, reconstructed from
$\{W^1_{2^j} f_{pi}(i_1,i_2),\ W^2_{2^j} E(i_1,i_2),\ W^3_{2^j} f_{pi}(i_1,i_2),\ j = 1,\ldots,4\}$ and $S_{2^4} f_{pi}(i_1,i_2)$,
where the wavelet basis for the decomposition and reconstruction was chosen
as the Haar basis. Figure 10 shows the final restored image $\hat{f}(i_1,i_2)$. We note that not only are the edges
parallel to the $i_1$ axis now unblurred (this is natural because we assumed that we knew the existence
of these edges), but also the artifacts around the edges have been significantly reduced.

Figure 11 shows $f_{lim}(i_1,i_2)$ for the Shepp-Logan phantom, with $L = 16$. Figure 12 shows the locations
where we assume a priori edge knowledge, and Figure 13 shows the final restored image $\hat{f}(i_1,i_2)$. Again,
artifacts have almost completely been eliminated.

To quantify our results, we again use the percent root mean square error, defined in Section III. Table
2 summarizes the error for $\hat{f}$ with three wavelet bases D2 (Haar basis), D4 and D6 defined in [12] and
for $J = 3$, 4, 5, and 6. We observe that a short wavelet basis such as D2 or D4, and a moderate number
of scales such as $J = 3$ or $J = 4$, are sufficient to obtain good restoration.
                        Geometric phantom (Fig. 4)    Shepp-Logan phantom (Fig. 5)
                        with 16 views missing         with 32 views missing

E_rms f_lim             3«.96                         28.91
E_rms f_pi              22.76                         16.82
E_rms S_{2^4} f_lim     29.13                         24.68
E_rms S_{2^4} f_pi      6.03                          3.87

Table 1. Comparison of $E_{rms}$ between images obtained using interpolation and images obtained by
setting unknown Fourier coefficients to zero.
Basis    J    Geom. phantom    S-L phantom

D2       3    11.98            7.08
D2       4    10.07            6.30
D2       5    11.73            7.18
D2       6    12.88            8.98
D4       3    12.73            7.30
D4       4    9.89             6.35
D4       5    11.86            7.06
D4       6    13.24            9.03
D6       3    13.09            7.40
D6       4    9.86             6.42
D6       5    11.93            7.08
D6       6    13.35            9.01

Table 2. $E_{rms}$ for different wavelet bases and different $J$.


Figure  Headings 


Fig. 1. The concentric circles grid and the bowtie-like region R over which Fourier transform samples are missing.

Fig. 2. The ideal partitioning of the Fourier plane with the filters H and G.

Fig. 3. The concentric circles grid and the rectangular grid for M = N = lo. Note that inside the small square in the center, the concentric grid has 30 samples and the rectangular grid has 9 samples.

Fig. 4. The geometric phantom, reconstructed from 128 x 128 projections.

Fig. 5. The Shepp-Logan phantom, reconstructed from 128 x 128 projections.

Fig. 6. $f_{lim}(i_1, i_2)$ obtained from 16 missing view angles ($\theta_{miss} = 22.5°$).

Fig. 7. Locations of edges that are known to be parallel to the $i_1$ axis.

Fig. 8. The edge image $E(i_1, i_2)$.
Fig.  9.  /ep/(il.22) 

Fig. 10. The final reconstructed image $\hat{f}(i_1, i_2)$.

Fig. 11. $f_{lim}(i_1, i_2)$ obtained from 32 missing view angles ($\theta_{miss} = 45°$).

Fig.  12  Locations  of  edges  that  are  known  to  be  parallel  to  the  axis 
Fig.  13.  The  final  reconstructed  image 


APPENDIX  N 

B.  Sahiner  and  A.E.  Yagle,  “Local  Reconstruction  from  Projections  using 
Exponential  Radial  Sampling,”  submitted  to  IEEE  Trans.  Image  Proc.,  July 
1993. 

This paper shows that the local tomography problem of reconstructing only a small region of interest (ROI) from a limited set of projections can be solved by sampling the projections at a rate that decreases exponentially with distance from the ROI. This reconstructs the ROI with high resolution, and the remainder of the image at lower resolution. The algorithm is also much faster than conventional filtered backprojection.


LOCAL  RECONSTRUCTION  FROM  PROJECTIONS 


USING  EXPONENTIAL  RADIAL  SAMPLING 

Berkman  Sahiner  and  Andrew  E.  Yagle 
Dept. of Electrical Engineering and Computer Science
The  University  of  Michigan,  Ann  Arbor,  Michigan  48109-2122 

June  1993 

Abstract 

We combine several ideas, including nonuniform sampling and circular harmonic expansions, into a new procedure for reconstructing a small region of interest (ROI) of an image from a set of its projections that are densely sampled in the ROI and coarsely sampled outside the ROI. Specifically, the radial sampling density of both the projections and the reconstructed image decreases exponentially with increasing distance from the ROI. The problem and data are reminiscent of the recently-formulated local tomography problem; however, our algorithm reconstructs the ROI of the image itself, not the filtered version of it obtained using local tomography. The new algorithm has the added advantages of speed (it can be implemented entirely using the FFT) and parallelizability (each image harmonic is computed independently). Numerical examples compare the new algorithm to filtered backprojection.
I. INTRODUCTION


The problem of image reconstruction from a complete set of projections is to compute an image $\mu(x,y)$ from a complete set of its line integrals $p(r,\theta)$, defined as

\[ p(r,\theta) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \mu(x,y)\,\delta(r - x\cos\theta - y\sin\theta)\,dx\,dy. \tag{1} \]

The most common procedure for reconstruction from a complete set of projections is filtered backprojection (FBP). In FBP the projections $p(r,\theta)$ are first filtered with a filter $h(r)$ whose Fourier transform $\hat{h}(w) = |w|$ up to some cutoff frequency, and is windowed to zero for higher frequencies. These filtered projections are then backprojected [1]. When the projections are sampled in the angular and radial variables, but cover the entire extent of the image, FBP still yields quite satisfactory results. The resolution of the reconstructed image is determined by the sampling densities in $r$ and $\theta$ of $p(r,\theta)$ and the cutoff frequency of $\hat{h}(w)$.

In many applications, it is not possible to obtain a complete set of projections which are sampled densely enough to attain the desired resolution over the entire support of $\mu(x,y)$. For example, X-ray dose limitations, or time constraints when imaging a moving object [2], may preclude such a large number of projections. If the entire support of $\mu(x,y)$ is covered, but projections are not sampled densely enough, then the desired resolution is not attained. If the projections are dense enough around some region of interest (ROI) of $\mu(x,y)$, but do not cover the entire support of $\mu(x,y)$, then a good reconstruction using FBP is not possible, due to the infinite support of the filter $h(r)$ (ideally, $h(r)$ is a derivative-Hilbert transform $\mathcal{H}\,d/dr$).

This is one reason why local tomography was introduced in [2]-[4]. In local tomography, $h(r)$ no longer approximates $\mathcal{H}\,d/dr$. Instead, the local filter $h(r) = d^2/dr^2 + \alpha\delta(r)$ is used, and the Fourier transform $\hat{\tilde{\mu}}(w_1,w_2)$ of the reconstructed image $\tilde{\mu}(x,y)$ is related to the Fourier transform $\hat{\mu}(w_1,w_2)$ of $\mu(x,y)$ by $\hat{\tilde{\mu}}(w_1,w_2) = \sqrt{w_1^2+w_2^2}\,\hat{\mu}(w_1,w_2) + \alpha\hat{\mu}(w_1,w_2)/\sqrt{w_1^2+w_2^2}$. The idea is that since this $h(r)$ is local, only projections passing through the ROI are used; no other projections need be taken. Local tomography was used successfully to image the coronary arterial tree in [2]. However, it is clear that $\tilde{\mu}(x,y) \neq \mu(x,y)$; for example, constant regions of $\mu(x,y)$ tend to become cup-shaped functions $\tilde{\mu}(x,y)$ [4]. Furthermore, local tomography is even more susceptible to noise than FBP, due to the extra noise-amplifying factor $\sqrt{w_1^2+w_2^2}$.

In this paper, we introduce a different type of local tomography, based on exponential radial sampling of the image and projections. We assume that we are interested in obtaining high resolution only in a small ROI; outside this region, high resolution is not very important. Without loss of generality, we assume that the ROI is centered on the origin (this can easily be achieved by translating $\mu(x,y)$). The angular sampling is conventional equiangular sampling, i.e., $p(r,\theta)$ is sampled in $\theta$ at angles $\theta_n = \frac{2\pi}{N}n$, $n = 0, 1, \ldots, N-1$. However, the radial sampling in $r$ in $p(r,\theta)$ and $\rho$ in $\mu(x,y) = \mu(\rho,\phi)$ (polar coordinates) is exponential, i.e., $p(r,\theta)$ is sampled in $r$ at distances $r_k = r_1 e^{(k-1)\Delta}$, $k \geq 1$.

This means that the samples are very dense around the origin (i.e., in the ROI), and the sampling density decreases exponentially with increasing distance from the origin. This gives us good resolution around the origin (in the ROI), and poor resolution far away from the origin (which is irrelevant). These remarks apply both to the data (the projections $p(r,\theta)$) and the reconstructed image $\mu(\rho,\phi)$. Although the exponential decrease of sampling density with increasing $r$ is not as sharp as the abrupt drop of sampling density to zero in the local tomography of [2]-[4], it is quite steep, regardless of the value of $\Delta$, and it is clearly in the spirit of localizing the projection data in a ROI. It shares the advantages of the local tomography of [2]-[4] (viz., using less data, with attendant smaller X-ray exposure). And it has a significant advantage over the local tomography of [2]-[4]: $\mu(x,y)$, not $\tilde{\mu}(x,y)$, is computed in the ROI.

For image reconstruction using exponential radial sampling, we use the circular harmonic decomposition, which is a Fourier expansion in the angular variable $\theta$ or $\phi$. This decomposition has been applied to reconstruction from projections in [5]-[7]. However, [5]-[7] used either continuous variables or uniform sampling, while we use exponential radial sampling. This creates two advantages: (1) it results in a local tomography problem, as described above; and (2) the reconstruction formula can be written as a regular convolution for each harmonic. Since the fast Fourier transform (FFT) can be used to implement these convolutions, all in parallel, this results in a reconstruction algorithm that is an order of magnitude faster than FBP. In [9] exponential sampling was used to make the Abel transform a convolution; however, that was a one-dimensional problem, while this paper treats a two-dimensional local tomography problem.

We use a bilinear polar-to-rectangular coordinate conversion algorithm to display the final image. The conversion algorithm can produce images which represent $\mu(x,y)$ at different resolutions, i.e., a small area of $\mu(x,y)$ in the ROI with very good resolution, or a larger area of $\mu(x,y)$ with less resolution. Since resolution increases exponentially with decreasing distance from the origin, we will be able to zoom in on the ROI with little loss of resolution.

The paper is organized as follows. In Section II, we give definitions and review the circular harmonic image reconstruction algorithms of [5]-[7]. In Section III, we apply exponential sampling and derive a new fast algorithm for reconstructing a small ROI from exponentially radially sampled projection data. In Section IV, we present and discuss some numerical examples, and compare our results with those from FBP with regular sampling. Section V concludes with a summary.

II.  CIRCULAR  HARMONIC  IMAGE  RECONSTRUCTION 

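Let $\mu(\rho,\phi)$ denote the image in polar coordinates. Since both the image and its projection $p(r,\theta)$ are periodic in the angular variable with period $2\pi$, they can be expanded in Fourier series (circular harmonic decompositions [5]-[7])

\[ \mu(\rho,\phi) = \sum_{n=-\infty}^{\infty} \mu_n(\rho)\,e^{in\phi}, \qquad p(r,\theta) = \sum_{n=-\infty}^{\infty} p_n(r)\,e^{in\theta}, \tag{2} \]

where

\[ \mu_n(\rho) = \frac{1}{2\pi}\int_0^{2\pi} \mu(\rho,\phi)\,e^{-in\phi}\,d\phi, \qquad p_n(r) = \frac{1}{2\pi}\int_0^{2\pi} p(r,\theta)\,e^{-in\theta}\,d\theta \tag{3} \]

are the circular harmonics of $\mu(\rho,\phi)$ and $p(r,\theta)$, respectively. Since $\mu(\rho,\phi)$ and $p(r,\theta)$ are both real, we have $\mu_{-n}(\rho) = \mu_n^*(\rho)$ and $p_{-n}(r) = p_n^*(r)$. We now show that $\mu_n(\rho)$ can be computed from $p_n(r)$ independently for each $n$.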
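As a hedged illustration of the decomposition (3), the following sketch approximates the circular harmonics of equally spaced angular samples by a Riemann sum, which the FFT evaluates directly. The function name `circular_harmonics`, the $1/N$ normalization and the test signal are assumptions made for this example, not notation from the paper.

```python
import numpy as np

def circular_harmonics(samples):
    """Approximate f_n = (1/2pi) * integral of f(theta) * exp(-i*n*theta) d(theta).

    samples[..., m] holds f(theta_m) at theta_m = 2*pi*m/N. The Riemann sum
    (1/N) * sum_m f(theta_m) * exp(-i*n*theta_m) is an FFT divided by N.
    Harmonic n is stored at index n % N, so negative n wrap to the end.
    """
    samples = np.asarray(samples, dtype=float)
    return np.fft.fft(samples, axis=-1) / samples.shape[-1]

N = 64
theta = 2 * np.pi * np.arange(N) / N
f = 2.0 + np.cos(3 * theta) + np.sin(theta)   # real test signal with harmonics 0, 1 and 3
c = circular_harmonics(f)
```

For real data the coefficients come in conjugate pairs, `c[-n] == conj(c[n])`, which is the discrete counterpart of the symmetry stated above and the reason only $n \geq 0$ needs to be reconstructed.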


A.  Original  Cormack  Formula 


Cormack [5] was the first to use circular harmonics in reconstruction from projections. He showed that the circular harmonics $\mu_n(\rho)$ of the image can be obtained from the circular harmonics $p_n(r)$ of the projections using

\[ \mu_n(\rho) = -\frac{1}{\pi}\int_{\rho}^{\infty} \frac{p_n'(r)\,T_n(r/\rho)}{\sqrt{r^2-\rho^2}}\,dr, \tag{4} \]

where $p_n'(r) = dp_n(r)/dr$ and $T_n(x) = \cos(n\cos^{-1}x)$ is the Chebyshev polynomial of the first kind of order $n$.

The Cormack formula (4) has been called the “causal, unstable” form of circular harmonic image reconstruction [6]. Equation (4) is causal in that $\mu_n(\rho)$ depends only on $p_n(r)$ for $r > \rho$; thus (4) solves the exterior Radon transform problem of reconstructing $\{\mu(\rho,\phi), \rho > R_0\}$ from $\{p(r,\theta), r > R_0\}$. It is unstable in that for large $r/\rho$, $T_n(r/\rho)$ behaves like $(r/\rho)^n$; therefore the integrand in (4) becomes very large, especially for small $\rho$. This instability makes (4) almost useless for image reconstruction from projections. We do not use (4) in this paper.


B.  Stable  Noncausal  Formula 

There also exists a “noncausal, stable” form of circular harmonic reconstruction [6], [7]:

\[ \mu_n(\rho) = \frac{1}{\pi\rho}\int_0^{\rho} U_{n-1}(r/\rho)\,p_n'(r)\,dr - \frac{1}{\pi}\int_{\rho}^{\infty} \frac{\exp(-n\cosh^{-1}(r/\rho))}{\sqrt{r^2-\rho^2}}\,p_n'(r)\,dr, \tag{5} \]

where $U_n(x) = \sin((n+1)\cos^{-1}x)/\sin(\cos^{-1}x)$ is the Chebyshev polynomial of the second kind of order $n$. Equation (5) is noncausal in that $\mu_n(\rho)$ depends on $p_n(r)$ for all $r$, not just $r > \rho$, but the integrands are stable, in that they are bounded as $n \to \infty$. Chapman and Cary [7] have used the noncausal, stable form (5) to perform fairly accurate reconstructions from regularly sampled projections.
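To make the contrast between the two forms concrete, the sketch below evaluates the Chebyshev factor of the causal form (4) and the exponential factor of the stable form (5) at one ratio $x = r/\rho > 1$, and checks the identity $T_n(x) = \sqrt{x^2-1}\,U_{n-1}(x) + e^{-n\cosh^{-1}x}$ that links them. The function names and the choice $n = 10$, $x = 5$ are illustrative assumptions.

```python
import math

def cormack_kernel(n, x):
    """T_n(x) for x >= 1, using T_n(x) = cosh(n * acosh(x))."""
    return math.cosh(n * math.acosh(x))

def stable_outer_kernel(n, x):
    """exp(-n * acosh(x)), the factor appearing in the second integral of (5)."""
    return math.exp(-n * math.acosh(x))

def chebyshev_u_prev(n, x):
    """U_{n-1}(x) for x > 1, using U_{n-1}(x) = sinh(n*u) / sinh(u) with u = acosh(x)."""
    u = math.acosh(x)
    return math.sinh(n * u) / math.sinh(u)

n, x = 10, 5.0                       # harmonic order and ratio r/rho
big = cormack_kernel(n, x)           # grows roughly like (2x)^n / 2
small = stable_outer_kernel(n, x)    # decays roughly like (2x)^(-n)
# Identity linking the two forms: T_n = sqrt(x^2 - 1) * U_{n-1} + exp(-n * acosh(x))
rebuilt = math.sqrt(x * x - 1.0) * chebyshev_u_prev(n, x) + small
```

At this modest order the unstable factor is already of order $10^9$ while the stable one is of order $10^{-10}$, which illustrates why the exterior integral of (5) is well behaved where that of (4) is not.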


C.  Reformulation  of  (4)  as  Convolution 

We now apply an idea first used in [9] on the Abel transform to the Cormack formula (4). Although we do not use the result further, it provides a simple illustration of the more complicated transformation we will apply in Section III to the discretized form of (5).

Make the change of variables $r = e^{-t}$, $\rho = e^{-\tau}$, and define $\tilde{\mu}_n(\tau) = \mu_n(\rho = e^{-\tau})$ and $\tilde{p}_n(t) = p_n(r = e^{-t})$. Then (4) becomes

\[ \tilde{\mu}_n(\tau) = \frac{1}{\pi}\int_{-\infty}^{\infty} \frac{d\tilde{p}_n(t)}{dt}\,e^{t}\,\frac{T_n(e^{\tau-t})}{\sqrt{1-e^{-2(\tau-t)}}}\,1(\tau-t)\,dt = \left(\frac{d\tilde{p}_n(\tau)}{d\tau}\,e^{\tau}\right) * \frac{T_n(e^{\tau})\,1(\tau)}{\pi\sqrt{1-e^{-2\tau}}}, \tag{6} \]


where $*$ denotes convolution and $1(t)$ is the unit step function. Thus the transform defined in (4) has become a simple filtering operation, which can be implemented using the FFT. Indeed, it suggests that the exponential warping of $r$ and $\rho$ is a more natural formulation, since (4) represents a time-varying filter and (6) represents a time-invariant filter.
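The substitution behind the convolution form can be spelled out step by step. The following is a sketch of the algebra, assuming $r = e^{-t}$ and $\rho = e^{-\tau}$ as above and the Cormack formula (4) integrated over $r > \rho$; the tilde notation for the warped functions is used here for illustration.

```latex
\begin{align*}
p_n'(r)\,dr &= \frac{d\tilde{p}_n(t)}{dt}\,dt,
  \qquad r \in (\rho,\infty) \iff t \in (-\infty,\tau),\\
\sqrt{r^2-\rho^2} &= e^{-t}\sqrt{1-e^{-2(\tau-t)}},
  \qquad \frac{r}{\rho} = e^{\tau-t},\\
\tilde{\mu}_n(\tau) &= \frac{1}{\pi}\int_{-\infty}^{\tau}
  \left(e^{t}\,\frac{d\tilde{p}_n(t)}{dt}\right)
  \frac{T_n(e^{\tau-t})}{\sqrt{1-e^{-2(\tau-t)}}}\,dt .
\end{align*}
```

The leading minus sign of (4) is absorbed when the orientation of the integral is reversed ($r$ increasing corresponds to $t$ decreasing). Because the second factor depends on $\tau$ and $t$ only through $\tau - t$, the last line is a convolution of $e^{t}\,d\tilde{p}_n/dt$ with a fixed one-sided kernel.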


III.  DERIVATION  OF  THE  ALGORITHM 


A.  Problem  Specification 

The problem that we solve in this section is defined as follows. Given samples of the projections of an image, where the sampling density is exponential in the radial variable and equiangular in the angular variable, compute the image on the same grid. That is, given

\[ \left\{ p(r_k,\theta_n),\ r_k = r_1 e^{(k-1)\Delta},\ k = 1,\ldots,K;\ \theta_n = \tfrac{2\pi}{N}n,\ n = 0,1,\ldots,N-1 \right\} \tag{7} \]

compute $\{\mu(\rho_k,\phi_n)\}$ for analogous values of $\rho_k$ and $\phi_n$. For convenience we define $r_0 = \rho_0 = 0$.

We assume that: (1) the image (and hence the projections also) is known to have its support inside a disk of radius $R$; and (2) the image (and hence the projections also) is known to have only $N$ circular harmonics $\mu_n(\rho)$ significantly different from zero. $r_1 = \rho_1$ is the smallest radius of interest, and $r_K = \rho_K = R$, so that $(K-1)\Delta = \ln(R/r_1)$. Note that the grid is very dense around $r_1$, and much sparser around $r_K$, with the density dropping off exponentially with increasing radius, regardless of the size of $\Delta$.




B.  Discretization 


We first discuss how to obtain the harmonics $\mu_n(\rho)$ of the image from the harmonics $p_n(r)$ of the projections. Since $\mu_{-n}(\rho) = \mu_n^*(\rho)$, we require image harmonics only for $n \geq 0$. Changing variables from $r$ to $x = \cos^{-1}(r/\rho)$ in the first integral of (5) and $x = \cosh^{-1}(r/\rho)$ in the second integral of (5), and substituting $\rho = \rho_j$, (5) can be written as [7]

\begin{align}
\mu_n(\rho_j) &= \frac{1}{\pi}\int_0^{\pi/2} \sin(nx)\,p_n'(\rho_j\cos x)\,dx - \frac{1}{\pi}\int_0^{\cosh^{-1}(R/\rho_j)} e^{-nx}\,p_n'(\rho_j\cosh x)\,dx \nonumber\\
&= -\frac{1}{\pi n}\int_{x=0}^{x=\pi/2} p_n'(\rho_j\cos x)\,d(\cos nx) + \frac{1}{\pi n}\int_{x=0}^{x=\cosh^{-1}(R/\rho_j)} p_n'(\rho_j\cosh x)\,d(e^{-nx}). \tag{8}
\end{align}


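We now generalize the result of [7]. Let $x_{j,k}$ be numbers such that

\[ 0 = x_{j,j} < x_{j,j-1} < \cdots < x_{j,0} = \pi/2 \quad\text{and}\quad 0 = x_{j,j} < x_{j,j+1} < \cdots < x_{j,K} = \cosh^{-1}(R/\rho_j). \tag{9} \]

Then, following [7], it is easily shown that a first approximation to (8) for $n \neq 0$ is given by

\[ \mu_n(\rho_j) = \frac{1}{\pi n}\sum_{k=0}^{j-1} a_{n,j}(k)\left(\cos(nx_{j,k+1}) - \cos(nx_{j,k})\right) + \frac{1}{\pi n}\sum_{k=j}^{K-1} a_{n,j}(k)\left(e^{-nx_{j,k+1}} - e^{-nx_{j,k}}\right), \tag{10} \]

where $a_{n,j}(k)$, which approximates $p_n'(\rho_j\cos x)$ or $p_n'(\rho_j\cosh x)$, is defined as

\[ a_{n,j}(k) = \begin{cases} \dfrac{p_n(\rho_j\cos x_{j,k+1}) - p_n(\rho_j\cos x_{j,k})}{\rho_j\cos x_{j,k+1} - \rho_j\cos x_{j,k}}, & k = 0,\ldots,j-1 \\[2ex] \dfrac{p_n(\rho_j\cosh x_{j,k+1}) - p_n(\rho_j\cosh x_{j,k})}{\rho_j\cosh x_{j,k+1} - \rho_j\cosh x_{j,k}}, & k = j,\ldots,K-1. \end{cases} \tag{11} \]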

C.  Exponential  Radial  Sampling 

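All of the following results are new. Instead of choosing $x_{j,k} = \cos^{-1}(k/j)$ or $\cosh^{-1}(k/j)$ as in [7], we choose

\[ x_{j,k} = x_{j-k} = \begin{cases} \cos^{-1}\!\left(e^{(k-j)\Delta}\right), & 1 \leq k \leq j \\ \cosh^{-1}\!\left(e^{(k-j)\Delta}\right), & j \leq k \leq K, \end{cases} \tag{12} \]

and we recall from (9) that $x_{j,0} = \pi/2$ for all $j$. This results in

\[ a_{n,j}(k) = a_n(k) = \frac{p_n(r_{k+1}) - p_n(r_k)}{r_{k+1} - r_k}, \tag{13} \]

which again is clearly a discrete representation of $p_n'(r)$. Defining

\[ s_n(j) = \begin{cases} \cos nx_{j-1} - \cos nx_j, & j > 0 \\ e^{-nx_{j-1}} - e^{-nx_j}, & j \leq 0 \end{cases} \tag{14} \]

and substituting in (10), we obtain the main result of this paper:

\[ \mu_n(\rho_j) = \frac{1}{\pi n}\left\{ a_n(0)\left(\cos nx_{j-1} - \cos\frac{n\pi}{2}\right) + \sum_{k=1}^{K-1} a_n(k)\,s_n(j-k) \right\}, \quad n \neq 0,\ j \neq 0. \tag{15} \]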

Equation (15) computes the exponentially-sampled image harmonics $\mu_n(\rho_j)$ from the exponentially-sampled projection harmonics $p_n(r_k)$. First, (13) “differentiates” $p_n(r_k)$; then the result is convolved with $s_n(j)$ to compute $\mu_n(\rho_j)$ (the first term in (15) is an end effect). Note that this is reminiscent of (6); however, it comes from the noncausal, stable form (5), not the causal, unstable Cormack formula (4).

Note that (15) cannot be obtained by a simple discretization of the continuous result of [6], due to the end effect and sampling points. In [6] an exponential change of variables in $r$ was applied directly to (5) without much discussion of the result, and no discretization was performed. Our result (15) differs from the results of [6] and [7] in the following ways: (1) unlike [6], we perform the change of variables (8); (2) unlike [6], our result is explicitly discrete and directly suitable for computer processing of sampled data; and (3) unlike [7], we use exponentially-sampled data. These differences, and the novel application to local tomography, justify calling (15) a new result.
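The structure described above, finite differences of the projection harmonic followed by a convolution over the radial index, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, the harmonic value at $r_0 = 0$ is supplied by the caller, and the test pair $\mu_1(\rho) = \rho e^{-\rho^2}$, $p_1(r) = \sqrt{\pi}\,r e^{-r^2}$ is chosen only because both sides are known in closed form. One routine evaluates the two sums of the piecewise approximation literally, one radius at a time; the other uses the end-effect term plus an FFT-based convolution with a lag kernel.

```python
import numpy as np

def exp_grid(r1, R, K):
    """Radii r_1..r_K with r_k = r1*exp((k-1)*delta) and r_K = R."""
    delta = np.log(R / r1) / (K - 1)
    return r1 * np.exp(delta * np.arange(K)), delta

def slopes(p0, p, r):
    """Finite-difference slopes a(k), k = 0..K-1; p0 is the harmonic at r_0 = 0."""
    return np.diff(np.concatenate(([p0], p))) / np.diff(np.concatenate(([0.0], r)))

def mu_direct(a, r, n):
    """Evaluate the interior and exterior sums literally for rho_j = r_j, j = 1..K."""
    K = len(r)
    rr = np.concatenate(([0.0], r))
    mu = np.zeros(K)
    for j in range(1, K + 1):
        ratio = rr / rr[j]
        x_in = np.arccos(np.minimum(ratio[: j + 1], 1.0))   # x_{j,k}, k = 0..j
        x_out = np.arccosh(np.maximum(ratio[j:], 1.0))      # x_{j,k}, k = j..K
        inner = np.sum(a[:j] * np.diff(np.cos(n * x_in)))
        outer = np.sum(a[j:] * np.diff(np.exp(-n * x_out)))
        mu[j - 1] = (inner + outer) / (np.pi * n)
    return mu

def x_lag(l, delta):
    """x_l = arccos(exp(-l*delta)) for l >= 0 and arccosh(exp(-l*delta)) for l <= 0."""
    l = np.asarray(l)
    e = np.exp(-delta * l)
    return np.where(l >= 0, np.arccos(np.minimum(e, 1.0)), np.arccosh(np.maximum(e, 1.0)))

def s_kernel(l, n, delta):
    """Lag kernel: cosine differences for l > 0, exponential differences for l <= 0."""
    l = np.asarray(l)
    xa, xb = x_lag(l - 1, delta), x_lag(l, delta)
    return np.where(l > 0, np.cos(n * xa) - np.cos(n * xb),
                    np.exp(-n * xa) - np.exp(-n * xb))

def mu_fft(a, n, delta):
    """End-effect term plus an FFT-based linear convolution over the radial index."""
    K = len(a)
    s = s_kernel(np.arange(2 - K, K), n, delta)      # lags j - k = 2-K .. K-1
    nfft = 1 << int(np.ceil(np.log2(3 * K)))         # room for the full linear convolution
    conv = np.fft.irfft(np.fft.rfft(a[1:], nfft) * np.fft.rfft(s, nfft), nfft)
    conv = conv[K - 2 : 2 * K - 2]                   # entries belonging to j = 1..K
    end = a[0] * (np.cos(n * x_lag(np.arange(K), delta)) - np.cos(n * np.pi / 2))
    return (end + conv) / (np.pi * n)

# Closed-form test pair (an assumption of this sketch, see the lead-in).
r, delta = exp_grid(0.01, 4.0, 128)
a = slopes(0.0, np.sqrt(np.pi) * r * np.exp(-r**2), r)
mu_a = mu_direct(a, r, n=1)
mu_b = mu_fft(a, n=1, delta=delta)
err = np.max(np.abs(mu_b - r * np.exp(-r**2)))
```

The two routines should agree to rounding error, since on the exponential grid the quantities $x_{j,k}$ depend only on $j - k$; that shift invariance is what turns the double loop into a single convolution per harmonic.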


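For the special cases $n = 0$ and $j = 0$, it can be shown that [7]

\[ \mu_0(\rho_j) = -\frac{1}{\pi}\sum_{k=j}^{K-1} a_0(k)\left(x_{j-k-1} - x_{j-k}\right) \tag{16} \]

and

\[ \mu_n(\rho_0) = \mu_n(0) = \begin{cases} -\dfrac{1}{\pi}\left\{ 2a_0(0) + \Delta\displaystyle\sum_{k=1}^{K-1} a_0(k) \right\}, & n = 0 \\ 0, & n \neq 0. \end{cases} \tag{17} \]


D. Equiangular Sampling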

Since we are given $p(r_k,\theta_m)$ for $\theta_m = \frac{2\pi}{N}m$, $m = 0,\ldots,N-1$, we consider $p(r,\theta) = 0$ for $\theta \neq \frac{2\pi}{N}m$. Since $p(r_k,\theta)$ is discrete and periodic in $\theta$, its Fourier transform is also discrete and periodic. Since by assumption $p(r_k,\theta)$ is angularly bandlimited, (3) becomes the discrete Fourier transform

\[ p_n(r_k) = \frac{1}{2\pi}\sum_{m=0}^{N-1} p(r_k,\theta_m)\,e^{-in\theta_m}, \quad n = -N/2+1,\ldots,N/2. \tag{18} \]

Similarly, (2) also becomes a discrete Fourier transform.

In actual application, $\mu(\rho,\phi)$ (and hence $p(r,\theta)$ also) is unlikely to be bandlimited to $N$ harmonics. Hence some aliasing may be expected. However, the higher-order harmonics tend to be smaller near the origin, i.e., in the ROI. Also, to reduce ringing effects caused by the sudden truncation of the circular harmonic expansion (2), we use not (2) but a windowed version of (2)

\[ \mu(\rho_j,\phi_m) = \sum_{n=-N/2+1}^{N/2} w_n\,\mu_n(\rho_j)\,e^{in\phi_m}, \tag{19} \]

where $w_n$ implements a Hamming window. A similar idea was used in [8].
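A windowed synthesis of this kind can be sketched with an inverse FFT. The paper states only that a Hamming window is applied to the harmonics; the particular form used below, $w_n = 0.54 + 0.46\cos(2\pi n/N)$ centered on $n = 0$, and the function name `synthesize` are assumptions of this example.

```python
import numpy as np

def synthesize(mu_n, window=True):
    """Sum harmonics (stored at index n % N) at the angles phi_m = 2*pi*m/N.

    Returns sum_n w_n * mu_n * exp(i*n*phi_m) for m = 0..N-1.
    """
    N = mu_n.shape[-1]
    n = np.fft.fftfreq(N, d=1.0 / N)             # signed harmonic indices in FFT order
    w = 0.54 + 0.46 * np.cos(2 * np.pi * n / N) if window else np.ones(N)
    return np.fft.ifft(w * mu_n, axis=-1) * N    # ifft carries a 1/N factor, undone here

N = 64
phi = 2 * np.pi * np.arange(N) / N
f = 2.0 + np.cos(3 * phi)
harmonics = np.fft.fft(f) / N                    # analysis step, Riemann-sum normalization
plain = synthesize(harmonics, window=False)      # reproduces f at the sample angles
tapered = synthesize(harmonics)                  # third harmonic attenuated by w_3
w3 = 0.54 + 0.46 * np.cos(2 * np.pi * 3 / N)
```

With this choice the mean value ($n = 0$) passes unchanged and the highest harmonics are scaled by about 0.08, which tapers the truncation of the series rather than cutting it off abruptly.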

E.  Summary  of  Algorithm 

Given $\{p(r_k,\theta_n),\ r_k = r_1 e^{(k-1)\Delta},\ k = 1,\ldots,K;\ \theta_n = \frac{2\pi}{N}n,\ n = 0,1,\ldots,N-1\}$:

1. Compute $p_n(r_k)$ using (18) and FFT, in parallel in $n$;

2. Compute $a_n(k)$ using (13), in parallel in $n$;

3. Compute $\mu_n(\rho_j)$ using (15) and FFT, in parallel in $n$;

4. Compute $\mu(\rho_j,\phi_m)$ using (19) and FFT, in parallel in $n$.


Finally, we compute the number of operations required to carry out our algorithm. All of the equations can be implemented using the FFT; for $N = K$, this requires $O(N^2\log N)$ operations. Since both (13) and (15) can be parallelized in $n$, an even greater computational speedup is possible. By comparison, FBP requires $O(N^3)$ operations to compute the image on an $N \times N$ grid. The computational savings is thus a factor of $O(N/\log N)$.
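As a sanity check on the discretization of this section, the $n = 0$ special case can be exercised on a radially symmetric test image whose projection is known in closed form. The sketch below assumes the Gaussian pair $\mu(\rho) = e^{-\rho^2}$, $p_0(r) = \sqrt{\pi}\,e^{-r^2}$, an outer radius $R = 4$ where the Gaussian is negligible, and a hypothetical helper name; it illustrates the sum over exterior cells with $x_{j,k} = \cosh^{-1}(r_k/\rho_j)$ and makes no claim about the accuracy reported in the paper.

```python
import numpy as np

def mu0_exponential(p0_at_grid, r):
    """Zeroth harmonic on the grid rho_j = r_j:
    -(1/pi) * sum_{k >= j} a_0(k) * (x_{j,k+1} - x_{j,k}), x_{j,k} = arccosh(r_k / rho_j)."""
    a = np.diff(p0_at_grid) / np.diff(r)           # slopes on the cells [r_k, r_{k+1}]
    mu = np.zeros(len(r))
    for j in range(len(r)):
        x = np.arccosh(np.maximum(r[j:] / r[j], 1.0))
        mu[j] = -np.sum(a[j:] * np.diff(x)) / np.pi
    return mu

K, r1, R = 128, 0.01, 4.0
delta = np.log(R / r1) / (K - 1)
r = r1 * np.exp(delta * np.arange(K))
p0 = np.sqrt(np.pi) * np.exp(-r**2)                # projection of the image exp(-rho^2)
mu0 = mu0_exponential(p0, r)
max_err = np.max(np.abs(mu0 - np.exp(-r**2)))
```

Each difference of $x$ values is the exact integral of the kernel $1/\sqrt{r^2-\rho_j^2}$ over one cell, so the only approximation is the piecewise-constant slope of $p_0$; the relative cell width $\Delta$ therefore controls the error at every radius.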
IV. NUMERICAL EXAMPLES


A.  Numerical  Procedures 

To demonstrate the effectiveness of our algorithm in achieving high resolution in a region of interest of the image while minimizing artifacts, we present some simulations using the Shepp-Logan phantom shown in Figure 1. The ROI is defined to consist of the three small ovals at the bottom. Accordingly, the image has been translated in the y-direction by 0.605, so that the small circle, surrounded by two small ovals, is now in the center of the image. All of the images shown in this section are displayed on a 256 x 256 grid, and for the projections $N = 512$ and $K = 128$ (i.e., 128 radial samples at each of 512 view angles). For exponential sampling, $r_1 = 0.01$ and $r_K = R = 1.6$, so

$\Delta = \ln(1.6/0.01)/127 = 0.039962$.
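The grid parameters quoted above can be reproduced in a few lines. The sketch below also counts how many radial samples fall within radius 0.2 of the origin, compared with a uniform grid using the same number of samples; the uniform grid spacing and the radius 0.2 (the half-width of the smallest displayed square) are illustrative assumptions, as is the function name.

```python
import math

def exponential_radii(r1, R, K):
    """Return (delta, [r_1, ..., r_K]) with r_k = r1*exp((k-1)*delta) and r_K = R."""
    delta = math.log(R / r1) / (K - 1)
    return delta, [r1 * math.exp(delta * k) for k in range(K)]

delta, radii = exponential_radii(0.01, 1.6, 128)

# Samples inside radius 0.2 for the exponential grid and for a hypothetical
# uniform grid r = 1.6*k/127, k = 0..127, with the same number of samples.
inside_exp = sum(1 for r in radii if r <= 0.2)
inside_uniform = sum(1 for k in range(128) if 1.6 * k / 127 <= 0.2)
```

Under these assumptions 75 of the 128 exponential samples lie within radius 0.2, against 16 for the uniform grid, which is the concentration of samples in the ROI that the method relies on.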

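To display our reconstructed images, we use a bilinear polar-to-rectangular interpolation algorithm. That is, let $(x,y)$ be a point at which interpolation is desired. Let $(x,y) = (\rho,\phi)$ lie inside the trapezoid with corners having polar coordinates $\{(\rho_k,\phi_k), (\rho_{k+1},\phi_k), (\rho_k,\phi_{k+1}), (\rho_{k+1},\phi_{k+1})\}$. Then the interpolated value is

\begin{align}
\mu(x,y) = \frac{1}{(\rho_{k+1}-\rho_k)(\phi_{k+1}-\phi_k)}\big[ &(\rho-\rho_k)(\phi-\phi_k)\,\mu(\rho_{k+1},\phi_{k+1}) + (\rho-\rho_k)(\phi_{k+1}-\phi)\,\mu(\rho_{k+1},\phi_k) \nonumber\\
&+ (\rho_{k+1}-\rho)(\phi-\phi_k)\,\mu(\rho_k,\phi_{k+1}) + (\rho_{k+1}-\rho)(\phi_{k+1}-\phi)\,\mu(\rho_k,\phi_k) \big], \tag{20}
\end{align}

where the prefactor normalizes the four weights so that they sum to one.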

Suppose that we are using this interpolation algorithm to display a 256 x 256 image covering the entire phantom. Then pixels far away from the origin of the image will be interpolated using reconstructed polar values $\mu(\rho_j,\phi_n)$ that are not very close to those pixels, while pixels close to the origin of the image will be interpolated using $\mu(\rho_j,\phi_n)$ that are very close to them. In such a situation, many values of $\mu(\rho_j,\phi_n)$ with very small $\rho_j$ will not be used at all in the interpolation to a rectangular grid, since they will not be closest to any pixel. Hence, we may “zoom in” to the origin of the image, i.e., display 256 x 256 images that cover smaller and smaller areas around the origin.
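The polar-to-rectangular display step can be sketched as a bilinear interpolation in $(\rho,\phi)$. The function name, the list-of-lists data layout and the handling of the angular wrap-around below are assumptions of this example; the four weights are the usual bilinear ones, normalized to sum to one.

```python
import math
from bisect import bisect_right

def polar_bilinear(x, y, rho, n_angles, values):
    """Interpolate values[k][m] = mu(rho[k], 2*pi*m/n_angles) at the point (x, y).

    rho must be increasing, with rho[0] <= hypot(x, y) <= rho[-1].
    """
    r = math.hypot(x, y)
    phi = math.atan2(y, x) % (2 * math.pi)
    k = min(bisect_right(rho, r) - 1, len(rho) - 2)     # rho[k] <= r <= rho[k+1]
    dphi = 2 * math.pi / n_angles
    m = min(int(phi // dphi), n_angles - 1)             # angular cell containing phi
    m1 = (m + 1) % n_angles                             # wrap around at 2*pi
    u = (r - rho[k]) / (rho[k + 1] - rho[k])            # radial weight in [0, 1]
    v = (phi - m * dphi) / dphi                         # angular weight in [0, 1]
    return ((1 - u) * (1 - v) * values[k][m] + u * (1 - v) * values[k + 1][m]
            + (1 - u) * v * values[k][m1] + u * v * values[k + 1][m1])

# Test surface that is bilinear in (rho, phi), so interpolation is exact inside a cell.
rho = [0.1 * 1.5**k for k in range(8)]
n_angles = 16
surface = lambda p, a: 1.0 + 2.0 * p + 0.5 * a + p * a
values = [[surface(p, 2 * math.pi * m / n_angles) for m in range(n_angles)] for p in rho]
sample = polar_bilinear(0.4 * math.cos(1.0), 0.4 * math.sin(1.0), rho, n_angles, values)
node = polar_bilinear(rho[3], 0.0, rho, n_angles, values)
```

Zooming in corresponds to calling such a routine on a finer rectangular grid covering a smaller square; near the origin the exponentially spaced radii leave many polar samples available per output pixel.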
To evaluate our results, we compare them to analogous results obtained using the same number ($NK$) of samples, but with uniform sampling in $r$, reconstructed using FBP [1]. The projections $p(r_k,\theta_n)$ are collected at uniformly spaced radii $r_k$, $k = 0,\ldots,K-1$; for $r > r_{K-1}$ we set $p(r,\theta_n) = p(r_{K-1},\theta_n)$ (this proved to be surprisingly effective, much more effective than setting these values to zero). The parameter $A$ determines the maximum radius that is sampled; as $A$ increases, a smaller region is sampled more finely.

B.  Discussion  of  Results 

Figs. 2-5 compare the results from our algorithm (Figs. 2a-5a) to those from FBP [1] (Figs. 2b-5b). The successive close-up views were obtained as follows. For our algorithm, the excess of $\mu(\rho_k,\phi_n)$ near the origin allows us to “zoom in” to the origin, as explained above. For FBP, we used $A$ = 1, 2, 4 and 8 to generate Figs. 2b-5b, respectively, as explained above.

Comparing  Fig.  2a  and  Fig.  2b,  it  is  clear  that  the  overall  FBP  image  is  sharper,  but  in 
the  ROI  our  exponential  sampling  algorithm  produces  a  sharper  image.  In  Fig.  2a,  note  the 
poor  resolution  at  the  top  of  the  image.  This  is  as  expected  -  this  region  is  far  from  the  origin, 
so  its  resolution  should  be  poor.  But  the  ROI  at  the  bottom  of  the  image  is  very  sharp. 

Zooming  in  on  the  ROI  in  Figs.  3-5,  our  algorithm  continues  to  produce  a  sharp  image, 
with  only  a  few  faint  circular  artifacts.  In  contrast,  FBP  produces  an  image  in  which  the  three 
ovals  are  almost  washed  out.  This  is  the  familiar  “dishing”  artifact  (named  by  Kak  [10]),  in 
which  the  image  is  artificially  bright  near  its  center.  Note  that  this  is  a  very  serious  error,  since 
the  three-oval  ROI  lies  inside  another  oval,  which  must  also  be  reconstructed  correctly  (i.e.,  the 
constant  but  non-zero  background  must  also  be  reconstructed). 

The dishing artifact in FBP is caused by the infinite support of the filter $h(r)$. In Fig. 5c, $A = 1$ and FBP is used to reconstruct only the square [-0.2, 0.2] x [-0.2, 0.2]. Since the projections now cover the support of the image, there are no dishing artifacts, but the reconstruction is much more blurred than the result of our algorithm (Fig. 5a). Also note that FBP requires a different set of projections for each of Figs. 2b-5b, while our algorithm uses the same set of projections for each of Figs. 2a-5a.


V.  CONCLUSION 

We  have  proposed  a  new  algorithm  for  reconstructing  a  small  ROI  of  an  image  from  a 
set  of  its  projections  that  have  been  exponentially  sampled  in  the  radial  variable.  Unlike  the 
similar  local  tomography  problem,  the  ROI  of  the  image  itself,  not  a  filtered  version  of  it,  is 
reconstructed.  The  algorithm  draws  on  previous  work  on  using  the  circular  harmonic  expansion 
for image reconstruction from projections. The major contributions of this paper include: (1)
recognition  of  the  applicability  of  an  exponential  radial  sampling  density  to  a  local  tomography 
problem;  (2)  implementation  of  a  discrete  equation  using  exponential  sampling  (previous  work 
used  a  continuous  exponential  transform  in  a  continuous  equation);  and  (3)  numerical  examples 
demonstrating  the  better  performance  of  the  algorithm  on  local  tomography,  as  compared  to 
FBP.  The  algorithm  is  also  very  fast  (implemented  entirely  using  the  FFT)  and  parallelizable 
(each  harmonic  is  treated  independently). 

An important topic for further research is to determine the minimum number of view angles required to obtain a good reconstruction in the ROI. Fewer image harmonics should be
needed  near  the  origin,  and  since  the  harmonics  are  independent,  fewer  projection  harmonics  are 
needed.  The  problem  is  to  determine  a  small  set  of  view  angles  that  would  enable  computation 
of  only  the  smallest  projection  harmonics.  This  is  a  topic  of  current  research. 
Figure Headings


Fig. 1. The Shepp-Logan phantom.

Fig. 2a. The result from our algorithm; the entire image in the square [-1.6, 1.6] x [-1.6, 1.6] is displayed on a 256 x 256 grid.

Fig. 2b. The result from FBP, with sampling parameter A = 1.

Fig. 3a. Same as Fig. 2a, but only the square [-0.8, 0.8] x [-0.8, 0.8] is displayed.

Fig. 3b. The result from FBP, with sampling parameter A = 2.

Fig. 4a. Same as Fig. 2a, but only the square [-0.4, 0.4] x [-0.4, 0.4] is displayed.

Fig. 4b. The result from FBP, with sampling parameter A = 4.

Fig. 5a. Same as Fig. 2a, but only the square [-0.2, 0.2] x [-0.2, 0.2] is displayed.

Fig. 5b. The result from FBP, with sampling parameter A = 8.

Fig. 5c. The result from FBP with sampling parameter A = 1; only the square [-0.2, 0.2] x [-0.2, 0.2] is displayed.


APPENDIX  O 

B.  Sahiner  and  A.E.  Yagle,  “Reconstruction  from  Projections  under  Time- 
Frequency  Constraints,”  submitted  to  IEEE  Trans.  Med.  Imag.,  August  1993. 

This paper derives a fast image-domain filter which solves the following constrained
inverse  Radon  transform  problem:  Given  constraints  on  certain  wavelet  coefficients  of  the 
image,  compute  from  its  projections  the  image  which  either:  (a)  requires  the  smallest 
perturbation  of  the  projection  data  to  satisfy  these  constraints;  or  (b)  is  the  constrained 
linear  least-squares  image  estimate.  The  wavelet  transform  can  be  used  for  spatially- 
varying  filtering  of  an  image,  suppressing  noise  locally  in  smooth  regions;  we  also  discuss 
detection  of  such  regions  in  a  noisy  image,  which  leads  to  the  wavelet  coefficient  constraints. 
Numerical  results  show  improvement  over  filtered  images,  since  the  constraints  improve 
the  reconstruction  in  non-constrained  areas  as  well. 
Abstract

Low-pass  filtering  computed  tomography  (CT)  images  to  reduce  noise  may  smooth 
or  modify  image  features  which  are  very  important  to  the  physician.  Image  features  are 
often  more  easily  identified  and  processed  in  the  time-frequency  plane.  We  use  time- 
frequency  distributions  for  spatially-varying  filtering  of  noisy  CT  images,  constraining 
time-frequency  representation  coefficients  of  the  projection  data  or  of  the  reconstructed 
image  to  be  zero  in  certain  regions  of  the  time-frequency  plane.  We  consider  two  different 
applications: (1) filtering the projection data, and then performing image reconstruction; and (2) filtering the reconstructed image directly. Criteria minimized, subject to
constraints,  may  be  either  a  deterministic  minimum  weighted  perturbation  of  the  given 
projection  data,  or  a  stochastic  minimum  mean-square  error  in  colored  Gaussian  noise. 
Results show improvement over processing the image with a linear spatially-invariant filter.
I. INTRODUCTION


Image reconstruction from projections in computed tomography amounts to finding the inverse Radon transform of the projection data, which is most often computed using the filtered backprojection method. A problem with the inverse Radon transform is that the ramp filter (with frequency response $|w|$; see (2) below) used in the filtering stage of the filtered backprojection method amplifies the high-frequency components of the noise.

In  medical  images,  noise  usually  dominates  at  high  frequencies,  i.e.,  the  Fourier  content  of 
the  noiseless  image  is  usually  small  at  high  frequencies,  whereas  the  Fourier  content  of  the  noise 
is  relatively  large  (e.g.,  white  noise).  The  high-frequency  content  of  the  image  is  due  mostly  to 
local  “image  features”,  such  as  edges.  If  the  image  does  not  contain  features  such  as  local  sharp 
intensity  variations  or  small  objects  embedded  in  noise,  or  if  such  features  are  unimportant, 
then  the  noise  can  be  reduced  by  processing  the  image  with  a  linear,  spatially-invariant  low- 
pass  filter.  However,  in  many  cases,  these  local  features  are  of  paramount  importance,  and  they 
are  precisely  what  the  physician  needs;  therefore,  modifying  them  by  a  low-pass  filter  may  be 
unacceptable.  This  is  one  reason  why  nonlinear  and  spatially-varying  techniques  are  frequently 
used  to  process  medical  images  [1,  2,  3,  4]. 

Time-frequency  representations  are  a  general  framework  in  which  perceptually  significant 
signal  features,  such  as  the  contours  of  an  object,  are  both  more  easily  identified  and  more  easily 
processed  [5].  Representations  based  solely  on  spatial  variables  do  not  provide  readily  available 
information  about  the  Fourier  content  of  the  data,  which  is  important  for  filtering.  On  the  other 
hand, representations based solely on Fourier analysis do not provide information about
localization  of  significant  image  features.  A  time-frequency  representation  is  an  attempt  to 
provide  the  “evolutionary  spectrum”  of  the  data,  combining  the  advantages  of  representations 
in  both  domains.  In  this  paper,  we  employ  the  discrete  short-time  Fourier  transform  (DSTFT) 
and  the  orthogonal  discrete  wavelet  transform  (DWT)  as  time-frequency  representations.  For 
a  detailed  discussion  of  time-frequency  representations  see  [6],  and  for  more  discussion  on  the 
wavelet  transform  see  [7]. 

For a one-dimensional function of time, a time-frequency representation is a mapping into
a two-dimensional function of time and frequency, which localizes the signal energy in both
time and frequency directions [8]. The two-dimensional plane on which the time-frequency
representation is defined is referred to as the time-frequency plane [6]. The localization is
achieved by expressing the original function in terms of basis functions which are effectively
confined to a compact region of support in the time-frequency plane [9]. Conversely, for a given
region in the time-frequency plane, there are a small number of functions which are effectively
nonzero over that region. Compactness of basis functions helps us in identifying the signal
features and filtering noise. Ideally, a signal feature is represented by as few basis functions
as possible. If a feature is present at a given location in the time-frequency plane, then the
basis  functions  which  cover  that  location  have  a  large  contribution  in  the  reconstruction  of  the 
signal  from  its  time-frequency  representation,  i.e.,  these  basis  functions  are  represented  by  large 
coefficients.  If  the  feature  is  absent,  then  the  corresponding  coefficients  are  relatively  small. 
Thus,  in  a  noisy  situation,  small  coefficients  are  likely  to  have  been  caused  by  noise,  rather  than 
signal  features. 

A typical procedure for filtering signals in the time-frequency plane makes use of the above idea:
The time-frequency representation of the signal is multiplied by a one-zero mask [10, 11, 12, 13,
14]. The mask is set to unity in the region of the time-frequency plane where the energy of the
signal is above a threshold (i.e., where signal features are likely to be present) and is set to zero
in the region of the time-frequency plane where the signal energy is below the threshold (i.e.,
where signal features are unlikely to be present) [10, 13]. The filtered signal is then obtained
by  mapping  the  modified  time-frequency  representation  back  to  the  signal  domain. 

In  this  paper,  we  apply  time-frequency  representations  to  the  filtering  problem  in  computed 
tomography.  We  look  at  the  options  of  filtering  the  projection  data  and  filtering  the  recon¬ 
structed  image  separately.  This  permits  incorporation  of  any  a  priori  information  about  the 
image  (e.g.,  edge  locations  and  smooth  areas)  directly  into  the  reconstruction  process,  rather 
than just post-processing the reconstructed image. We will show that this incorporation improves
the reconstruction of unconstrained regions as well as constrained regions. Meanwhile,
the thresholding approach allows us to perform what amounts to spatially-varying low-pass
filtering, in either the projection domain or the image domain.

For  filtering  of  the  projection  data,  we  use  the  thresholding  approach  described  above  for 
two  closely  related  time-frequency  representations:  the  discrete  short-time  Fourier  transform 
and  the  discrete  wavelet  transform.  For  a  given  projection  angle,  we  regard  the  projection  data 
as a 1-D function of the radial variable, and compute the related time-frequency representations.
After  time-frequency  filtering  is  performed  for  each  view  angle  separately,  the  filtered  image  is 
obtained  using  the  filtered  backprojection  method. 

For  filtering  the  reconstructed  image  directly,  we  first  use  thresholding  in  the  image  domain 
to  set  constraints  on  the  2-D  DWT  of  the  reconstructed  image.  We  then  find  the  filtered  image 
which  satisfies  the  imposed  wavelet  constraints  and  minimizes  either  of  two  error  criteria.  The 
first  criterion,  which  is  deterministic,  is  the  norm  of  the  difference  between  the  given  projections 
and  the  projections  of  the  filtered  image.  The  second  criterion,  which  is  stochastic,  is  the  mean- 
square  error  of  the  filtered  image.  We  show  that  both  criteria  lead  to  similar  algorithms  which 
operate  directly  on  the  reconstructed  image. 

The  paper  is  organized  as  follows.  In  Section  II,  we  briefly  review  the  inverse  Radon  trans¬ 
form,  the  DSTFT  and  the  DWT.  In  Section  III,  we  present  our  algorithm  for  filtering  the 
projection  data,  and  provide  some  examples.  This  extends  our  previous  results  [29,  30]  from 
DSTFT to new results for DWT; we also present many more numerical examples. In Section
IV, we discuss our general approach for setting wavelet constraints in the image domain, and
provide a statistical justification for thresholding the absolute value of the DWT. This extends
our previous result on thresholding [27] from continuous wavelet transform to discrete wavelet
transform. In Section V, we develop the deterministic formulation for processing the image,
given wavelet constraints in the image domain. In Section VI, we develop the stochastic
formulation, and obtain an algorithm similar to that of Section V. This links the result of Section V,
which is new, to our previous results [27, 28]. In Section VII, we provide some examples using
the algorithm developed in Sections IV and V. We conclude with a summary in the final section.


II. INVERSE RADON TRANSFORMS, DSTFTS AND DWTS

A.  The  Inverse  Radon  Transform 

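The basic reconstruction from projections or inverse Radon transform problem is to reconstruct
an image $\mu(x,y)$ from its projections $p(r,\theta)$, where

$$p(r,\theta) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \mu(x,y)\,\delta(r - x\cos\theta - y\sin\theta)\,dx\,dy \qquad (1)$$

is the Radon transform of $\mu(x,y)$. The image is reconstructed from its projections using the
inverse Radon transform

$$\mu(x,y) = \mathcal{R}^{-1}\{p(r,\theta)\} = \frac{1}{4\pi^2}\int_{0}^{\pi}\int_{-\infty}^{\infty} P(w,\theta)\,Q(w)\,\exp[jw(x\cos\theta + y\sin\theta)]\,dw\,d\theta \qquad (2)$$

where $P(w,\theta) = \mathcal{F}\{p(r,\theta)\}$ is the Fourier transform in the $r$ variable of $p(r,\theta)$, and
$Q(w) = |w|$. In practice, $Q(w)$ is a real and symmetric function of $w$ that approximates the ideal
Radon kernel $|w|$ [15, 25].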

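Equation (2) can be sampled in the image domain to yield a discrete image $\mu(I,J)$:

$$\mu(I,J) = \frac{1}{4\pi^2}\int_{0}^{\pi}\int_{-\infty}^{\infty} P(w,\theta)\,Q(w)\,\exp[jw(I\cos\theta + J\sin\theta)]\,dw\,d\theta \qquad (3)$$

In practical problems, we have only samples $p(m,n)$ of $p(r,\theta)$, in which case samples $\mu(I,J)$
of the image $\mu(x,y)$ are obtained by discretizing (2) into [15]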

2  N~\  oo  oo 

p{TJ)  =  'Rl^p{m,n)}  =  —  J2  - ‘m)p{m,n)h{i  cosnA  + j  sinn A  -  1),  (4) 

n=0  m=-oo  /=_oo 

where $N$ is the total number of views (the number of angles for which projections are available),
$p(m,n)$ is the projection in the $n$-th view, $\Delta = \pi/N$, $h(x)$ is an interpolation function, and $q(m)$
is a discrete filter that approximates $q(r) = \mathcal{F}^{-1}\{Q(w)\}$. Equation (4) defines the discrete version
of the inverse Radon transform operator.


B.  The  DSTFT 


For a discrete-time signal $f(m)$, the DSTFT is defined as [16]:

$$F(l,m) = \sum_{k=-\infty}^{\infty} f(k)\,a(Sm - k)\,e^{-j2\pi lk/\mathcal{L}} \qquad (5)$$

where $\mathcal{L}$ is the transform size, $S < \mathcal{L}$ is the sampling period in the time domain,
$l = 0, 1, \ldots, \mathcal{L}-1$ is the frequency variable, and $m$ is the time variable. $a(m)$ is called the
analysis filter, or the analysis window.

To compute $F(l,m)$ for a given value of the time variable $m$, the signal $f(k)$ is first windowed
with a translated (by $Sm$ samples) version of the analysis window $a(k)$, and then the discrete
Fourier transform (DFT) is computed (this can be done exactly using an FFT). For different
values of $m$, the analysis window is centered around different portions of the signal $f(k)$, and the
DFT computes the Fourier contents of that portion of the windowed signal. Therefore, the
time-variation in the time-frequency distribution $F(l,m)$ is specified by the amount of translation of
the analysis window, and the frequency-variation is specified by the Fourier transform. Note
that if $f(m)$ is nonzero over an interval of $M$ samples, then its DSTFT will be nonzero over
a lattice of approximately $(\mathcal{L}/S)M$ samples (ignoring edge effects). Since $S < \mathcal{L}$, the DSTFT is in
general redundant.
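
As a rough illustration of definition (5), the following Python sketch evaluates the DSTFT by direct summation. It is not the authors' implementation: the rectangular window, the test signal, and the names `dstft`, `window`, and `frames` are assumptions made for this example, and the paper's own examples use a Gaussian window and an FFT.

```python
import cmath

def dstft(f, a, S, L, frames):
    """Direct evaluation of F(l, m) = sum_k f(k) a(S*m - k) exp(-2j*pi*l*k/L).

    f      : list of samples f(0), ..., f(len(f) - 1); zero elsewhere
    a      : dict mapping an integer index to the analysis window value there
    S, L   : time sampling period and transform size
    frames : iterable of time indices m at which to evaluate the transform
    """
    out = {}
    for m in frames:
        row = []
        for l in range(L):
            acc = 0j
            for k, fk in enumerate(f):
                w = a.get(S * m - k, 0.0)  # window translated by S*m samples
                if w:
                    acc += fk * w * cmath.exp(-2j * cmath.pi * l * k / L)
            row.append(acc)
        out[m] = row
    return out

# Hypothetical example: rectangular window of length L, hop S = L / 2.
L, S, M = 8, 4, 32
window = {i: 1.0 for i in range(L)}
signal = [float((k % 5) - 2) for k in range(M)]
frames = range(0, (M + L) // S)          # enough frames to cover every sample
F = dstft(signal, window, S, L, frames)
num_coeffs = len(F) * L                  # exceeds M, so the DSTFT is redundant
```

With these values the transform holds more coefficients than the signal has samples, roughly $(\mathcal{L}/S)M$ plus edge frames, which is the redundancy noted above.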

With mild conditions on the analysis filter (e.g., the first $S$ samples of $a(m)$ should be
nonzero, or if $S$ divides $\mathcal{L}$, none of the $S$ polyphases of $a(m)$ should be identically zero), $f(m)$
can be recovered exactly from $F(l,m)$ [17]. The inversion formula is

oo  1  C-l 

/(^)  =  H  d(m  -  ^  F(/,m)e^^'"’  (6) 

k=-co  ^  1=0 

where  a(m),  called  the  synthesis  filter,  is  determined  by  the  analysis  filter  a(m).  An  algebraic 
approach  to  determining  a(m)  for  exact  reconstruction  is  given  in  [18]. 

C. The DWT

The basic idea of the orthogonal discrete wavelet transform is to represent a sequence $f(I)$ as
a superposition of translations and dilations of a wavelet $g(I)$. As opposed to the DSTFT, no
Fourier transform is computed in the DWT. The frequency-variation is specified by how much the
wavelet is dilated. Similarly to the DSTFT, the time-variation is specified by the translation of the
wavelet. The recursive formula for the wavelet decomposition $W_{2^l}f(I)$ of $f(I)$ is [21]

$$f_l(I) = \sum_k h(2I - k)\,f_{l-1}(k)$$

$$W_{2^l}f(I) = \sum_k g(2I - k)\,f_{l-1}(k) \qquad (7)$$

where, for a fixed scale $l_0$, $W_{2^{l_0}}f(I)$ is called the detail signal at scale $l_0$, and $f_{l_0}(I)$ is called the
average signal at scale $l_0$. The recursion is started with $f_0(I) = f(I)$. The sequences $h(I)$ and
$g(I)$ are called the scaling function and the wavelet, respectively, and satisfy

$$g(I) = (-1)^{I}\,h(1 - I). \qquad (8)$$

The scaling function is usually a low-pass filter, and (8) ensures that the wavelet is a
high-pass filter. At each scale $l$, the average signal from the previous scale is convolved with the
high-pass filter $g(I)$, and one sample out of two is retained. The scale number $l$ acts like the
frequency variable, and the time-index $I$ acts as the time-variable. Note that if we start with
a signal $f(I)$ which is nonzero over an interval of $M$ samples, then the wavelet representation
$\{W_{2^1}f(I), W_{2^2}f(I), \ldots, W_{2^L}f(I), f_L(I)\}$ will also have $M$ nonzero samples (ignoring end effects
due to nonzero length of $h(I)$ and $g(I)$) for any $L$. Thus, the DWT is not redundant.
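
The recursion (7) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes the two-tap Haar scaling filter and periodic extension of the finite signal, and the names `analyze` and `dwt` are hypothetical.

```python
import math

# Haar scaling filter h (an assumption for this sketch) and the wavelet
# obtained from it through g(I) = (-1)^I h(1 - I), as in (8).
H = {0: 1 / math.sqrt(2), 1: 1 / math.sqrt(2)}
G = {i: (-1) ** i * H[1 - i] for i in (0, 1)}

def analyze(avg):
    """One step of (7): filter with h and g, keep one sample out of two."""
    n = len(avg)
    nxt, det = [], []
    for i in range(n // 2):
        # sum_k h(2i - k) f(k), written over the filter tap t = 2i - k,
        # with periodic indexing of the finite signal
        nxt.append(sum(H[t] * avg[(2 * i - t) % n] for t in H))
        det.append(sum(G[t] * avg[(2 * i - t) % n] for t in G))
    return nxt, det

def dwt(f, levels):
    """Return ([detail at scale 1, ..., detail at scale L], average at scale L)."""
    details, avg = [], list(f)
    for _ in range(levels):
        avg, det = analyze(avg)
        details.append(det)
    return details, avg

signal = [4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 2.0]
details, avg = dwt(signal, 3)
total = sum(len(d) for d in details) + len(avg)   # same count as the input
```

For an input of 8 samples and 3 scales, the detail signals have 4, 2, and 1 samples and the final average has 1, so the representation holds exactly as many coefficients as the signal, in line with the non-redundancy remark above.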

The recursive formula for the reconstruction of $f(I)$ from its wavelet transform $W_{2^l}f(I)$ is

$$f_{l-1}(I) = \sum_k \left[ h(2k - I)\,f_l(k) + g(2k - I)\,W_{2^l}f(k) \right]. \qquad (9)$$

The conditions that $h(I)$ and $g(I)$ must satisfy for the orthogonal decomposition-reconstruction
of (7) and (9) to work are [22]:

1. Exact reconstruction condition:

$$\sum_k \left[ h(I_1 - 2k)\,h(I_2 - 2k) + g(I_1 - 2k)\,g(I_2 - 2k) \right] = \delta(I_1 - I_2) \qquad (10)$$

2. Orthogonality condition:

$$\sum_k h(k - 2I_1)\,g(k - 2I_2) = 0 \quad \text{for all } I_1 \text{ and } I_2. \qquad (11)$$
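
Conditions (10) and (11) can be checked numerically for a specific filter pair. The sketch below is illustrative only: it assumes the four-tap Daubechies scaling filter (not the D18 filter used later in the paper) and builds the wavelet from relation (8); the helper names are hypothetical.

```python
import math

# Four-tap Daubechies scaling filter (an assumption for this sketch).
s3 = math.sqrt(3.0)
c = 4.0 * math.sqrt(2.0)
h = {0: (1 + s3) / c, 1: (3 + s3) / c, 2: (3 - s3) / c, 3: (1 - s3) / c}
# Wavelet from (8): g(I) = (-1)^I h(1 - I), supported on I = -2, ..., 1.
g = {i: (-1) ** i * h[1 - i] for i in range(-2, 2)}

def tap(filt, i):
    """Filter value at index i, zero outside the support."""
    return filt.get(i, 0.0)

def exact_reconstruction(i1, i2, span=8):
    """Left-hand side of condition (10)."""
    return sum(tap(h, i1 - 2 * k) * tap(h, i2 - 2 * k)
               + tap(g, i1 - 2 * k) * tap(g, i2 - 2 * k)
               for k in range(-span, span + 1))

def orthogonality(i1, i2, span=16):
    """Left-hand side of condition (11)."""
    return sum(tap(h, k - 2 * i1) * tap(g, k - 2 * i2)
               for k in range(-span, span + 1))

checks = [(i1, i2) for i1 in range(-3, 4) for i2 in range(-3, 4)]
worst_10 = max(abs(exact_reconstruction(a, b) - (1.0 if a == b else 0.0))
               for a, b in checks)
worst_11 = max(abs(orthogonality(a, b)) for a, b in checks)
```

Both `worst_10` and `worst_11` should be at the level of floating-point rounding for any orthonormal filter pair related through (8).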

For a 2-D sequence $\mu(I,J)$, the separable, orthogonal 2-D wavelet transform is
defined recursively as

$$\mu_l(I,J) = \sum_{k_1}\sum_{k_2} h(2I - k_1)\,h(2J - k_2)\,\mu_{l-1}(k_1,k_2)$$

$$W^{(z)}_{2^l}\mu(I,J) = \sum_{k_1}\sum_{k_2} g^{(z)}(2I - k_1,\,2J - k_2)\,\mu_{l-1}(k_1,k_2), \quad z = 1,2,3 \qquad (12)$$

where $g^{(z)}$, $z = 1,2,3$ are called sub-wavelets and are defined by $g^{(1)}(I,J) = g(J)h(I)$,
$g^{(2)}(I,J) = g(I)h(J)$, and $g^{(3)}(I,J) = g(I)g(J)$ [21]. Since $h$ is low-pass and $g$ is high-pass, the first
sub-wavelet is low-pass in the $I$ direction and high-pass in the $J$ direction, the second is low-pass in
the $J$ direction and high-pass in the $I$ direction, and the third is high-pass in both directions.
The signals $W^{(z)}_{2^l}\mu(I,J)$, $z = 1,2,3$ represent details of $\mu$ in the $J$, $I$, and diagonal directions,
respectively.

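As opposed to the recursive formula (12), the detail signals $W^{(z)}_{2^l}\mu(I,J)$ and the average
signal $\mu_l(I,J)$ can also be computed directly from $\mu(I,J)$ for any given $l$. The direct
decomposition formula is

$$\mu_l(I,J) = \sum_{k_1}\sum_{k_2} h_l(2^l I - k_1)\,h_l(2^l J - k_2)\,\mu(k_1,k_2)$$

$$W^{(z)}_{2^l}\mu(I,J) = \sum_{k_1}\sum_{k_2} g^{(z)}_l(2^l I - k_1,\,2^l J - k_2)\,\mu(k_1,k_2) \qquad (13)$$

Using induction, it can easily be shown that the filters $h_l(I)$ and $g^{(z)}_l(I,J)$ satisfy

$$h_{l+1}(I) = \sum_k h(k)\,h_l(I - 2^l k)$$

$$g^{(z)}_{l+1}(I,J) = \sum_{k_1}\sum_{k_2} g^{(z)}(k_1,k_2)\,h_l(I - 2^l k_1)\,h_l(J - 2^l k_2) \qquad (14)$$

where $h_1(I) = h(I)$ and $g^{(z)}_1(I,J) = g^{(z)}(I,J)$. In words, $h_{l+1}$ is obtained by inserting $2^l - 1$ zeros
between the coefficients of $h$, and convolving the resulting sequence with $h_l$; the average signal
$\mu_{l+1}$ is then computed by convolving $\mu$ with $h_{l+1}$ in the $I$ and $J$ directions and retaining one
sample out of every $2^{l+1}$ samples in each direction. Similar remarks apply to the computation
of the detail signals.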


As a direct consequence of the orthogonality of the wavelets, it can be shown that properly
translated versions of the filters are orthogonal; this is the orthonormality relation (15).


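Given the detail signals $W^{(z)}_{2^l}\mu$ for $l = 1, \ldots, L$ and the average signal $\mu_L(I,J)$, the original signal
can be recovered either using a recursive formula analogous to (9) [21], or using the
direct formula

$$\mu(I,J) = \sum_{z=1}^{3}\sum_{l=1}^{L}\sum_{k_1}\sum_{k_2} g^{(z)}_l(2^l k_1 - I,\,2^l k_2 - J)\,W^{(z)}_{2^l}\mu(k_1,k_2)
+ \sum_{k_1}\sum_{k_2} h_L(2^L k_1 - I)\,h_L(2^L k_2 - J)\,\mu_L(k_1,k_2) \qquad (16)$$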


III.  PROJECTION  CONSTRAINTS 

A  common  technique  for  recovering  a  noise-corrupted  signal  is  filtering  in  the  Fourier  plane, 
in  which  the  Fourier  transform  of  the  signal  is  windowed  to  zero  at  frequencies  where  the  signal 
energy  is  much  smaller  than  the  noise  energy.  A  similar  idea  can  be  applied  in  the  time- 
frequency  plane:  Window  the  time-frequency  representation  of  the  signal  to  zero  (or  constrain 
the  time-frequency  representation  to  be  zero)  in  regions  of  the  time-frequency  plane  where  the 
signal  energy  is  much  smaller  than  the  noise  energy.  This  idea  has  been  applied  to  various 
signals in [10, 11, 12, 13, 14], and is referred to as time-frequency filtering.

In image reconstruction from projections, filtering for noise is usually implemented in the
projection domain, and is combined with ramp filtering. With a suitably chosen $q(m)$, both
noise filtering and ramp filtering can be accomplished at once. The choice of $q(m)$ involves a
compromise between image resolution and noise. If the cutoff frequency of $q(m)$ is chosen
to be too large, then the resulting image can be too noisy; if the cutoff frequency is chosen
to be too small, then the resulting degradation of resolution may be too severe. To preserve
the resolution, it is desirable to attenuate the noise energy only where the signal energy is
also small. However, spatially-invariant filters are unable to change their characteristics in a
spatially-varying manner, so this selectivity cannot be attained. In contrast, time-frequency
filtering is by definition spatially-varying, hence the desired selectivity may be attained by
filtering the projections in the time-frequency plane.

In this section, we apply the idea of time-frequency filtering to noisy projections $p_\eta(m,n)$ of
an image by constraining the time-frequency representation of $p_\eta(m,n)$ to be zero in
certain regions of the time-frequency plane. For a given view angle $n_0$, we regard the projection
$p_\eta(m,n_0)$ as a function of $m$, and process its time-frequency representation. That is, $m$ (a
spatial variable) plays the role of 'time' in a time-frequency representation, and we apply
time-frequency filtering to process the projection data in a spatially-varying fashion. For notational
simplicity, we drop the variable $n$ from $p_\eta(m,n)$ in the rest of this section, and refer to $p_\eta(m,n_0)$
as $p_\eta(m)$. The procedures described below are applied to each view separately. This section
extends our previous results [29, 30] to the DWT, and provides more numerical examples.

A.  Filtering  using  DSTFT 

Let $p_a(m)$ be the noiseless projection, and $p_\eta(m)$ be the noisy projection for a given view angle
$n_0$. To filter $p_\eta(m)$ using DSTFT, we first multiply its DSTFT $P_\eta(l,m)$ with a one-zero mask.
The mask windows the DSTFT of $p_\eta(m)$ to zero in regions of the time-frequency plane (the $(m,l)$
plane) where the signal energy is much smaller than the noise energy, and does not alter the
DSTFT of $p_\eta(m)$ elsewhere. Thus we have

$$\hat{P}(l,m) = P_\eta(l,m)\,Z(l,m) \qquad (17)$$

where the one-zero mask $Z(l,m)$ is either estimated from the DSTFT of the data using thresholding,
or is given to us as a priori information. Note that this can be viewed as a crude form of Wiener
filtering, but in the time-frequency plane rather than just the frequency domain.

After $\hat{P}(l,m)$ is obtained, the second and final step is the computation of the filtered
projection $\hat{p}(m)$ whose DSTFT is $\hat{P}(l,m)$. Since the DSTFT is redundant, not every 2-D sequence
$\hat{P}(l,m)$ is a valid DSTFT, i.e., there may not be any sequence $\hat{p}(m)$ whose DSTFT is $\hat{P}(l,m)$.
We therefore find the signal $\hat{p}(m)$ whose DSTFT best approximates $\hat{P}(l,m)$ in the mean-square
error sense [19]. It has been shown [20] that if the analysis filter is not longer
than the transform length $\mathcal{L}$, then the minimum mean-square error solution coincides with the
synthesis equation (6); hence we use (6) in the examples below.

To obtain the filtered image $\hat{\mu}(I,J)$, we perform the above procedure to compute the filtered
projection for each view angle, and then proceed with the inverse Radon transform formula.

B.  Filtering  using  DWT 

To filter $p_\eta(m)$ using the DWT, we first multiply the DWT of $p_\eta(m)$ with a mask, similarly to (17):

$$W_{2^l}\hat{p}(m) = W_{2^l}p_\eta(m)\,Z(l,m) \qquad (18)$$

where the mask $Z(l,m)$ again filters the projection in the $(m,l)$ plane. We then compute the
inverse wavelet transform to find the filtered projection. In contrast to the DSTFT, the DWT is
not redundant; hence the filtered sequence $W_{2^l}\hat{p}(m)$ is always a valid DWT, and there is no
need to compute a minimum mean-square error solution.
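
The masking step (18) followed by the inverse transform can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' procedure: it uses a Haar transform on adjacent sample pairs rather than the D18 wavelet, estimates the mask by thresholding the noisy coefficients themselves, leaves the coarsest average untouched, and all function names are hypothetical.

```python
import math

R2 = math.sqrt(2.0)

def haar_forward(x, levels):
    """Haar DWT of a list whose length is divisible by 2**levels."""
    details, avg = [], list(x)
    for _ in range(levels):
        pairs = [(avg[2 * i], avg[2 * i + 1]) for i in range(len(avg) // 2)]
        details.append([(a - b) / R2 for a, b in pairs])
        avg = [(a + b) / R2 for a, b in pairs]
    return details, avg

def haar_inverse(details, avg):
    """Invert haar_forward, working from the coarsest scale back to the finest."""
    avg = list(avg)
    for det in reversed(details):
        nxt = []
        for s, d in zip(avg, det):
            nxt += [(s + d) / R2, (s - d) / R2]
        avg = nxt
    return avg

def dwt_filter(noisy, levels, threshold):
    """Zero every detail coefficient whose squared magnitude is <= threshold."""
    details, avg = haar_forward(noisy, levels)
    kept = [[d if d * d > threshold else 0.0 for d in det] for det in details]
    return haar_inverse(kept, avg)

# Hypothetical 1-D "projection": a step edge with small perturbations.
noisy = [1.0, 1.1, 0.9, 1.0, 5.0, 5.1, 4.9, 5.0]
filtered = dwt_filter(noisy, levels=3, threshold=0.1)
```

In this toy case the small fluctuations on either side of the step are removed while the step itself, carried by one large coarse-scale coefficient, is left intact. Because the DWT is not redundant, the masked coefficients invert directly with no least-squares fit.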

C.  Examples  and  Discussion 

We now apply the procedures discussed above to a frequently-used phantom in medical
imaging. Our phantom $\mu_a(I,J)$ (shown in Figure 1) looks like the Shepp-Logan phantom of [25];
however, the gray levels have been modified as in [26]. Reconstructions are performed from
noisy  projections  with  128  radial  samples  and  128  angular  samples.  The  additive  noise  is  zero- 
mean,  white,  and  Gaussian,  with  variance  6.25  x  10““*.  Using  FBP  and  the  time  invariant  filter 
Q{w)  from  [25],  with  various  cutoff  frequencies,  the  “best”  (in  the  sense  of  subjective  human 
observation)  reconstruction  from  noisy  data  is  given  in  Figure  2. 

For filtering using DSTFT, we determine our mask from the noiseless data as

$$Z(l,m) = \begin{cases} 1 & \text{if } |P_a(l,m)|^2 > \nu_1 \\ 0 & \text{otherwise,} \end{cases} \qquad (19)$$

where $\nu_1$ is a fixed threshold. We choose our analysis filter to be Gaussian with length 32 (i.e.,
$a(m)$ is Gaussian for $m = -16, -15, \ldots, 15$, and $a(m) = 0$ otherwise). The transform size is $\mathcal{L} = 32$.
Using DSTFT with various thresholds $\nu_1$, the “best” filtered image is shown in Figure 3.

For filtering using DWT, we determine the mask from the noiseless data as

$$Z(l,m) = \begin{cases} 1 & \text{if } |W_{2^l}p_a(m)|^2 > \nu_2 \\ 0 & \text{otherwise,} \end{cases} \qquad (20)$$

where $\nu_2$ is again a fixed threshold. We use the D18 wavelet described in [22]. Using DWT
with various thresholds $\nu_2$, the “best” filtered image is shown in Figure 4.

Comparing  Figures  2,  3  and  4,  we  see  that  edges  are  relatively  well-preserved  in  Figures  3 
and  4  (for  example,  the  edges  of  the  skull  are  widened  in  Figure  2  due  to  low-pass  filtering, 
but  not  in  Figures  3  and  4).  At  the  same  time,  the  noise  level  is  reduced  in  Figures  3  and  4, 
and image features are more easily distinguished from the background. The percent root mean
square error, $E_{rms} = (100\%)\,\|\hat{\mu} - \mu_a\|_2 / \|\mu_a\|_2$, is 15.76%, 11.44%, and 11.61% respectively for
Figures 2, 3, and 4, which reflects the improvement in noise level. It should be remembered,
however, that a lower $E_{rms}$ does not necessarily mean a better image for human observation,
and $E_{rms}$ is only a secondary criterion.
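
For reference, the percent root mean square error can be computed as in the short Python sketch below. It assumes both images have been flattened to equal-length lists of pixel values; the function name is hypothetical.

```python
import math

def percent_rms_error(estimate, reference):
    """100 * ||estimate - reference||_2 / ||reference||_2 over flattened pixels."""
    num = math.sqrt(sum((e - r) ** 2 for e, r in zip(estimate, reference)))
    den = math.sqrt(sum(r ** 2 for r in reference))
    return 100.0 * num / den

# Toy two-pixel example: error norm 0.5 against a reference of norm 5.
example = percent_rms_error([3.0, 4.5], [3.0, 4.0])
```

The measure is normalized by the energy of the reference image, so it is unchanged when both images are scaled by the same factor.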

Figures  5  and  6  show  the  analogous  results  when  the  mask  is  determined  from  the  noisy  data. 
We  used  a  fixed  threshold  to  obtain  Figure  5,  and  a  scale-dependent  threshold  to  obtain  Figure 
6.  The  results  are  somewhat  worse  than  Figures  3  and  4.  However,  note  that  the  reconstructed 
image  can  be  reprojected  and  time-frequency  thresholding  applied  to  reprojections.  This  results 
in  an  iterative  algorithm  that  generalizes  iterated  Wiener  filtered  image  reconstruction  [31] 
from  frequency  domain  to  time-frequency  plane.  The  results  of  iterated  time-frequency  image 
reconstruction  will  be  presented  elsewhere. 




IV.  IMAGE  CONSTRAINTS 


In Sections V and VI, we consider the problem of reconstructing an image from its
projections, given the constraint that some fine-scale wavelet transform values around a region A
of the image are zero. Since fine-scale wavelet transform components represent localized
high-resolution features of the image, this constraint means that A represents a flat or slowly-varying
part of the image, or that A is free of edges. Constraining fine-scale wavelet coefficients in A
to be zero effectively smoothes the image, much as low-pass filtering does. The advantage of
using wavelets is that this can be done on a localized basis, smoothing some areas while leaving
other areas (such as edges) unaffected.

Since  the  Radon  transform  is  not  unitary,  knowledge  about  one  part  of  the  image  will 
improve  the  quality  of  the  overall  reconstructed  image.  Numerical  examples  will  show  that  the 
constraints  improve  the  reconstruction  of  the  entire  image,  not  just  the  constrained  region. 

The  constraints  may  either  be  given  as  a  priori  information  (for  example,  it  may  be  known 
that  A  is  free  of  edges),  or  we  may  impose  the  constraints  after  thresholding  the  absolute  value 
of  the  wavelet  transform  of  the  reconstructed  image.  As  mentioned  in  the  introduction,  the 
idea  of  thresholding  the  time-frequency  representation  of  a  signal  for  spatially-varying  filtering 
has  been  previously  used  in  the  literature  [13].  Below,  we  supply  a  statistical  justification  for 
thresholding  the  absolute  value  of  DWT. 

Detection problem formulation:

Assume that $\rho(I,J)$ is a zero-mean white Gaussian random sequence with power spectral
density $\sigma^2$. Let $W^{(z)}_{2^l}\rho(I,J)$, $l = 1, \ldots, \infty$, $z = 1,2,3$ be its wavelet transform defined using
(13). Then, by the orthonormality of the wavelets (15), the quadruply indexed random sequence
$W^{(z)}_{2^l}\rho(I,J)$ is uncorrelated and zero-mean, with variance $\sigma^2$. To obtain a random sequence
$\mu(I,J)$ whose wavelet coefficients are zero with probability one outside a region $D_1$,
we define (compare to (16))

$$\mu(I,J) = \sum_{(l,k_1,k_2,z) \in D_1} g^{(z)}_l(2^l k_1 - I,\,2^l k_2 - J)\,W^{(z)}_{2^l}\rho(k_1,k_2) \qquad (21)$$

We now state the problem. Given the noisy observations

$$\mu_\eta(I,J) = \mu(I,J) + \eta(I,J) \qquad (22)$$

of $\mu(I,J)$, where $\eta(I,J)$ is a zero-mean white Gaussian noise sequence with power spectral
density $\sigma_\eta^2$, determine the region $D_1$.

Detection problem solution:

The wavelet transform of $\mu_\eta(I,J)$ is

$$W^{(z)}_{2^l}\mu_\eta(I,J) = \begin{cases} W^{(z)}_{2^l}\rho(I,J) + W^{(z)}_{2^l}\eta(I,J) & \text{if } (l,I,J,z) \in D_1 \\ W^{(z)}_{2^l}\eta(I,J) & \text{otherwise,} \end{cases} \qquad (23)$$

where $W^{(z)}_{2^l}\eta(I,J)$ is the wavelet transform of $\eta(I,J)$. Note that $W^{(z)}_{2^l}\mu_\eta(I,J)$ is a zero-mean
uncorrelated Gaussian random sequence whose variance is $\sigma^2 + \sigma_\eta^2$ for $(l,I,J,z)$ inside $D_1$ and
$\sigma_\eta^2$ for $(l,I,J,z)$ outside $D_1$. Therefore, the decision of whether a point $(l,I,J,z) \in D_1$ decouples
from similar decisions for other points. Furthermore, for each point in $Z^4$, this becomes the
well-known problem of detection of a Gaussian random variable in Gaussian noise. Its solution
is the likelihood ratio test [23]

$$\left| W^{(z)}_{2^l}\mu_\eta(I,J) \right| \;\underset{\notin D_1}{\overset{\in D_1}{\gtrless}}\; \nu \qquad (24)$$

where $\nu$ is the threshold (to be determined). This means that we can decide whether $(l,I,J,z)$
is in $D_1$ simply by thresholding the absolute value of the wavelet transform of the noisy image
at that point.

The threshold $\nu$ can be determined using the Neyman-Pearson criterion. The false alarm
probability $P_F$ = [probability of saying that $(l,I,J,z) \in D_1$ when it is not] is equated to the level
of significance $\alpha$, resulting in the following formula for $\nu$:

$$\nu = \sigma_\eta\,\mathrm{erfc}^{-1}(\alpha/2); \qquad \mathrm{erfc}(x) = \int_x^{\infty} \frac{1}{\sqrt{2\pi}}\,e^{-t^2/2}\,dt \qquad (25)$$

The detection probability $P_D$ = [probability of correctly detecting that $(l,I,J,z) \in D_1$] is then

$$P_D = 2\,\mathrm{erfc}\!\left( \frac{\nu}{\sqrt{\sigma^2 + \sigma_\eta^2}} \right) \qquad (26)$$

When the whiteness or orthonormality assumptions are relaxed, the threshold test described
above is no longer guaranteed to be optimal. However, the threshold test still seems to be the
“natural” approach.
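
The Neyman-Pearson threshold can be evaluated numerically as in the sketch below. It rests on one assumption that should be stated plainly: the function written erfc in the text is taken to be the upper tail probability of a standard Gaussian, which is the reading under which the false alarm probability of the two-sided test equals $\alpha$. The function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # zero mean, unit variance

def np_threshold(sigma_noise, alpha):
    """Threshold nu such that P(|x| > nu) = alpha for x ~ N(0, sigma_noise^2)."""
    # Upper-tail probability alpha / 2 corresponds to the (1 - alpha/2) quantile.
    return sigma_noise * STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)

def detection_probability(nu, sigma_signal, sigma_noise):
    """P(|x| > nu) for x ~ N(0, sigma_signal^2 + sigma_noise^2)."""
    total = (sigma_signal ** 2 + sigma_noise ** 2) ** 0.5
    return 2.0 * (1.0 - STD_NORMAL.cdf(nu / total))

alpha = 0.05
nu = np_threshold(sigma_noise=1.0, alpha=alpha)
false_alarm = 2.0 * (1.0 - STD_NORMAL.cdf(nu / 1.0))   # should reproduce alpha
p_detect = detection_probability(nu, sigma_signal=3.0, sigma_noise=1.0)
```

Lowering the threshold raises both the false alarm and the detection probabilities, so the level of significance sets the trade-off between them.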


In practice, we will wish to be quite sure about thresholding wavelet coefficients to zero,
since these will be used as constraints in reconstructing the image from its noisy projections.
Thus, $P_D \approx 1$ and the threshold $\nu$ will be very small. This has the additional advantage of
requiring fewer wavelet coefficients to be constrained to zero in the algorithm of Sections V and
VI.


V. IMAGE CONSTRAINTS-DETERMINISTIC APPROACH

To facilitate the derivation of our algorithm, we assume in Sections V and VI that the
projections are complete, i.e., we have $p(r,\theta)$ for all $r$ and $\theta$. An explicit derivation for the
discrete problem (sampled projections) is available in [32], but leads to results virtually identical
to the continuous case.

The problem that we address in this section is defined as follows. Given projections $p(r,\theta)$ of
an image $\mu_a(x,y)$, and given that the wavelet transform $W^{(z)}_{2^l}\mu_a(I,J)$ with respect to $Z$ ($Z \le 3$;
see (12)) different sub-wavelets is zero for several values of $I$, $J$ on $L$ different scales, perturb
the projections $p(r,\theta)$ such that:

1.  The  image  reconstructed  from  the  perturbed  projections  satisfies  the  wavelet  constraints: 

2.  The  distance  between  the  projections  and  the  perturbed  projections  is  minimized. 

For clarity of presentation, in Subsection V.A we consider the case where only one
sub-wavelet is involved and the distance measure is the Euclidean distance. We generalize to more
than one sub-wavelet, and a more general distance measure, in Subsection V.B.

A.  Constraints  on  a  Single  Wavelet 

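Let $p(r,\theta)$ be the given projections, $\hat{p}(r,\theta)$ be the perturbed projections, and let $\mu(I,J)$ and
$\hat{\mu}(I,J)$ be their inverse Radon transforms, respectively. Also, define the differences
$\delta p(r,\theta) = \hat{p}(r,\theta) - p(r,\theta)$ and $\delta\mu(I,J) = \hat{\mu}(I,J) - \mu(I,J)$, and define the inner product of two
real functions $p_1(r,\theta)$ and $p_2(r,\theta)$ in the projection space as

$$\langle p_1(r,\theta),\,p_2(r,\theta) \rangle = \int_0^{\pi}\int_{-\infty}^{\infty} p_1(r,\theta)\,p_2(r,\theta)\,dr\,d\theta. \qquad (27)$$

Using Parseval's theorem, this inner product can be written in the frequency domain as

$$\langle P_1(w,\theta),\,P_2(w,\theta) \rangle = \frac{1}{2\pi}\int_0^{\pi}\int_{-\infty}^{\infty} P_1(w,\theta)\,P_2^{*}(w,\theta)\,dw\,d\theta \qquad (28)$$

where $P_1(w,\theta)$ and $P_2(w,\theta)$ denote the continuous Fourier transforms of $p_1(r,\theta)$ and $p_2(r,\theta)$,
respectively, and $*$ denotes complex conjugate.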

We assume that the wavelet transform of $\mu_a(I,J)$ with respect to the first wavelet is
known to be zero at $C(l)$ points at scale $l$, where $1 \le l \le L$. This knowledge could come either
from a priori information about the image, e.g., known absence of edges and sharp features
from physiological knowledge, or from thresholding the absolute value of the wavelet transform
of the image as discussed in Section IV. We therefore constrain the wavelet transform of the
perturbed image to be zero at those points. We index each of these points by a pair $(c,l)$, where $l$
denotes the scale and $c$ enumerates the points at each scale. The constraints can then be written as

$$W^{(1)}_{2^l}\mu_a(i_{c,l},\,j_{c,l}) = W^{(1)}_{2^l}\hat{\mu}(i_{c,l},\,j_{c,l}) = 0, \quad 1 \le l \le L, \quad 1 \le c \le C(l). \qquad (29)$$




The problem is to determine $\hat{p}(r,\theta)$ (or its inverse Radon transform $\hat{\mu}(I,J)$) such that:

1. The constraints (29) are satisfied; and

2. The induced norm $\int_0^{\pi}\int_{-\infty}^{\infty}(\delta p(r,\theta))^2\,dr\,d\theta$ is minimized.

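This problem can be solved as a norm minimization problem. Taking the wavelet transform
of $\delta\mu(I,J) = \mu(I,J) - \hat{\mu}(I,J)$ using (13), the constraints can be written as

$$\sum_I\sum_J \delta\mu(I,J)\,g_l^{(1)}(2^l i_{c,l} - I,\ 2^l j_{c,l} - J) = W_{2^l}^{(1)}\mu(i_{c,l}, j_{c,l}). \tag{30}$$

Since $\mu(I,J)$ can be computed from the given projections $p(r,\theta)$, $W_{2^l}^{(1)}\mu(i_{c,l}, j_{c,l})$ is known.
Our goal is to compute $\delta\mu(I,J)$, from which the perturbed image can be computed as

$$\hat{\mu}(I,J) = \mu(I,J) - \delta\mu(I,J). \tag{31}$$

Using the sampled inverse Radon transform formula (3), (30) can be written as

$$\frac{1}{4\pi^2}\int_0^{\pi}\int_{-\infty}^{\infty} \delta P(w,\theta)\,Q(w)\,G_l^{(1)}(w,\theta)\exp[jw2^l(i_{c,l}\cos\theta + j_{c,l}\sin\theta)]\,dw\,d\theta = W_{2^l}^{(1)}\mu(i_{c,l}, j_{c,l}), \tag{32}$$

where $G_l^{(1)}(w,\theta)$ is defined as

$$G_l^{(1)}(w,\theta) = \sum_I\sum_J g_l^{(1)}(I,J)\exp[-jw(I\cos\theta + J\sin\theta)]. \tag{33}$$

Eq. (32) can in turn be written as an inner product using (28):

$$\left\langle \delta P(w,\theta),\ \frac{1}{2\pi}Q(w)\,G_l^{*(1)}(w,\theta)\exp[-jw2^l(i_{c,l}\cos\theta + j_{c,l}\sin\theta)]\right\rangle = W_{2^l}^{(1)}\mu(i_{c,l}, j_{c,l}). \tag{34}$$

Using the projection theorem, the minimum norm solution is found to be

$$\delta P(w,\theta) = \frac{1}{2\pi}\sum_{l=1}^{L}\sum_{c=1}^{C(l)} \beta_{c,l}\,Q(w)\,G_l^{*(1)}(w,\theta)\exp[-jw2^l(i_{c,l}\cos\theta + j_{c,l}\sin\theta)], \tag{35}$$

where the $\beta_{c,l}$ are computed by solving the matrix equation

$$M\beta = b. \tag{36}$$

In (36), $\beta$ and $b$ are vectors which contain the unknowns $\beta_{c,l}$ and the knowns $W_{2^l}^{(1)}\mu(i_{c,l}, j_{c,l})$:

$$\begin{aligned}
\beta &= [\beta_{1,1}, \ldots, \beta_{C(1),1}, \beta_{1,2}, \ldots, \beta_{C(2),2}, \ldots, \beta_{C(L),L}]^T,\\
b &= [W_{2}^{(1)}\mu(i_{1,1}, j_{1,1}), \ldots, W_{2^2}^{(1)}\mu(i_{C(2),2}, j_{C(2),2}), \ldots, W_{2^L}^{(1)}\mu(i_{C(L),L}, j_{C(L),L})]^T.
\end{aligned} \tag{37}$$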

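The system matrix $M$ consists of submatrices $M_{l,m}$, each of which contains the inner
products of the terms on the right-hand side of the inner product in (34). That is,

$$M = \begin{bmatrix} M_{1,1} & M_{1,2} & \cdots & M_{1,L}\\ M_{2,1} & M_{2,2} & \cdots & M_{2,L}\\ \vdots & \vdots & & \vdots\\ M_{L,1} & M_{L,2} & \cdots & M_{L,L} \end{bmatrix}, \tag{38}$$

where the $(u,v)$th entry of $M_{l,m}$ is

$$\left\langle \frac{1}{2\pi}Q(w)\,G_l^{*(1)}(w,\theta)\exp[-jw2^l(i_{u,l}\cos\theta + j_{u,l}\sin\theta)],\ \frac{1}{2\pi}Q(w)\,G_m^{*(1)}(w,\theta)\exp[-jw2^m(i_{v,m}\cos\theta + j_{v,m}\sin\theta)]\right\rangle. \tag{39}$$

Writing the inner product (39) as an integral, expanding $G_l^{(1)}(w,\theta)$ and $G_m^{(1)}(w,\theta)$ using
(33), and defining $R(I,J) = \mathcal{R}^{-1}\{q(r)\}$, we find that the $(u,v)$th entry of $M_{l,m}$ is given by

$$\frac{1}{2\pi}\sum_{s_1}\sum_{t_1}\sum_{s_2}\sum_{t_2} g_l^{(1)}(s_1,t_1)\,g_m^{(1)}(s_2,t_2)\,R(2^m i_{v,m} - 2^l i_{u,l} + s_1 - s_2,\ 2^m j_{v,m} - 2^l j_{u,l} + t_1 - t_2). \tag{40}$$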

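The inverse Fourier transform of $\delta P(w,\theta)$ is computed by finding the sampled inverse Radon
transform of (35), which is

$$\delta\mu(I,J) = \sum_{l=1}^{L}\sum_{c=1}^{C(l)} \beta_{c,l}\,\tilde{g}_l^{(1)}(2^l i_{c,l} - I,\ 2^l j_{c,l} - J), \tag{41}$$

where $\tilde{g}_l^{(1)}$ is defined in terms of the convolution of $g_l^{(1)}$ and $R$ as

$$\tilde{g}_l^{(1)}(I,J) = \frac{1}{2\pi}\sum_{s_1}\sum_{t_1} g_l^{(1)}(s_1,t_1)\,R(I - s_1,\ J - t_1). \tag{42}$$

Thus, to compute the perturbed (filtered) image, we first compute $\delta\mu(I,J)$ using (41), where
the $\beta_{c,l}$ are solved from the linear system of equations (36), and $\tilde{g}_l^{(1)}$ is given by (42). Then, the
perturbed image is found using (31).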
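
The projection-theorem step above has a simple finite-dimensional analogue, sketched below under stated assumptions: the rows of a hypothetical matrix `Phi` stand in for the constraint functionals on the right-hand side of (34), `b` stands in for the known wavelet values, and the Gram matrix `Phi @ Phi.T` plays the role of $M$ in (36). The names `min_norm_solution`, `Phi`, and `b` are illustrative only and do not come from the report.

```python
import numpy as np

def min_norm_solution(Phi, b):
    """Return the smallest-Euclidean-norm x satisfying Phi @ x = b.

    Each row of Phi plays the role of one constraint functional;
    b holds the known right-hand sides.
    """
    M = Phi @ Phi.T                # Gram matrix of the constraint functionals
    beta = np.linalg.solve(M, b)   # coefficients, the analogue of solving M beta = b
    return Phi.T @ beta            # x = sum_k beta_k * phi_k

rng = np.random.default_rng(0)
Phi = rng.standard_normal((3, 8))  # 3 hypothetical constraints on 8 unknowns
b = rng.standard_normal(3)
x = min_norm_solution(Phi, b)

# Any other feasible point differs from x by a null-space vector and is longer.
null_dir = np.linalg.svd(Phi)[2][-1]   # unit vector with Phi @ null_dir ~ 0
x_other = x + 0.5 * null_dir
print(np.linalg.norm(x), np.linalg.norm(x_other))
```

The solution is a linear combination of the constraint functionals themselves, which is why only a system of the size of the number of constraints has to be solved.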

Note that if the wavelet $g(I)$ is a finite-length sequence, a constraint on the wavelet transform
of the image will involve only a finite number of pixel values. However, the perturbed image will
be improved for all $I$ and $J$, since the inverse Radon transform $\mathcal{R}^{-1}\{q(r)\}$ (and thus $\delta\mu(I,J)$
computed using (41)) will have infinite extent for a general choice of the filter $Q(w)$. This is
illustrated in the numerical examples of Section VII.


B. Constraints on Several Sub-Wavelets


We now generalize to more than one sub-wavelet, and to a more general distance measure. Let
$Z \le 3$ be the maximum number of wavelets used, and let $z \in \{1, \ldots, Z\}$ denote the sub-wavelet
number (see (12)). We index points by the superscript $z$ and subscripts $(c,l)$, where $l$ denotes
the scale and $c$ enumerates the points at each scale. The constraints are

$$W_{2^l}^{(z)}\mu_a(i^z_{c,l}, j^z_{c,l}) = W_{2^l}^{(z)}\hat{\mu}(i^z_{c,l}, j^z_{c,l}) = 0, \quad 1 \le l \le L, \quad 1 \le c \le C(l), \quad 1 \le z \le Z. \tag{43}$$

We also now redefine the inner product of two functions $P_1(w,\theta)$ and $P_2(w,\theta)$ as

$$\langle P_1(w,\theta), P_2(w,\theta)\rangle = \frac{1}{2\pi}\int_0^{\pi}\int_{-\infty}^{\infty} \frac{P_1(w,\theta)\,P_2^{*}(w,\theta)}{T(w)}\,dw\,d\theta, \tag{44}$$

where $T(w)$ is the Fourier transform of a weighting function $t(r)$. $T(w)$ can be any real, even
and positive function of $w$.

The inner product (44) induces a generalized, weighted distance (compare to
$\int_0^{\pi}\int_{-\infty}^{\infty}(\delta p(r,\theta))^2\,dr\,d\theta$), which allows the projections $p(r,\theta)$ to be weighted non-uniformly in
$r$. This is useful in situations where some projections are known more reliably than others;
such situations can arise in many different ways in collecting medical imaging data. For example,
if the projections are known to be noisier for high $|w|$, due to the limited radial resolution
in a rotating CAT-scan, these values can be assigned less weight in minimizing the perturbation
of the projections (i.e., they can be perturbed more without much increase in penalty).
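
The effect of the weighting can be seen in a two-unknown toy problem. The sketch below is an illustration under assumptions, not the report's algorithm: it minimizes $\sum_i x_i^2/T_i$ subject to one linear constraint, where the hypothetical vector `T` plays the role of $T(w)$ in (44). A larger `T` entry means a smaller penalty, so that component absorbs more of the perturbation.

```python
import numpy as np

def weighted_min_norm(Phi, b, T):
    """Minimize sum(x**2 / T) subject to Phi @ x = b, with T > 0 elementwise."""
    TPhiT = T[:, None] * Phi.T               # diag(T) @ Phi.T
    beta = np.linalg.solve(Phi @ TPhiT, b)   # weighted Gram system
    return TPhiT @ beta

Phi = np.array([[1.0, 1.0]])   # single constraint: x[0] + x[1] = 1
b = np.array([1.0])

x_uniform = weighted_min_norm(Phi, b, np.array([1.0, 1.0]))
x_weighted = weighted_min_norm(Phi, b, np.array([1.0, 4.0]))
print(x_uniform)    # equal split: [0.5 0.5]
print(x_weighted)   # the lightly penalized component moves more: [0.2 0.8]
```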

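The arguments used in the previous subsection lead to the formula

$$\delta\mu(I,J) = \sum_{z=1}^{Z}\sum_{l=1}^{L}\sum_{c=1}^{C(l)} \beta^z_{c,l}\,\tilde{g}_l^{(z)}(2^l i^z_{c,l} - I,\ 2^l j^z_{c,l} - J), \tag{45}$$

where

$$\tilde{g}_l^{(z)}(I,J) = \frac{1}{2\pi}\sum_{s_1}\sum_{t_1} g_l^{(z)}(s_1,t_1)\,R_t(I - s_1,\ J - t_1) \tag{46}$$

and

$$R_t(I,J) = \mathcal{R}^{-1}\{q * t\} = \mathcal{R}^{-1}\left\{\int q(\xi)\,t(r - \xi)\,d\xi\right\}. \tag{47}$$

The $\beta^z_{c,l}$ are again computed by solving $M\beta = b$, where

$$M = \begin{bmatrix} [M_{l,m}(1,1)] & \cdots & [M_{l,m}(1,Z)]\\ \vdots & & \vdots\\ [M_{l,m}(Z,1)] & \cdots & [M_{l,m}(Z,Z)] \end{bmatrix}, \tag{48}$$

in which each block $[M_{l,m}(z_1,z_2)]$ is arranged over $1 \le l, m \le L$ as in (38),
and where the $(u,v)$th entry of $M_{l,m}(z_1,z_2)$ is

$$\frac{1}{2\pi}\sum_{s_1}\sum_{t_1}\sum_{s_2}\sum_{t_2} g_l^{(z_1)}(s_1,t_1)\,g_m^{(z_2)}(s_2,t_2)\,R_t(2^m i^{z_2}_{v,m} - 2^l i^{z_1}_{u,l} + s_1 - s_2,\ 2^m j^{z_2}_{v,m} - 2^l j^{z_1}_{u,l} + t_1 - t_2). \tag{49}$$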

VI.  IMAGE  CONSTRAINTS-STOCHASTIC  APPROACH 

In this section, we address the following stochastic version of the problem addressed in
Section V: Suppose that we are given noisy observations $p_\eta(r,\theta) = p_a(r,\theta) + \eta(r,\theta)$ of the
projections $p_a(r,\theta)$ of an actual image $\mu_a(x,y)$, where $\eta(r,\theta)$ is a zero-mean Gaussian random
process in $r$ for each $\theta$, and is uncorrelated in $\theta$. Given constraints on the wavelet transform
values of the image, find the image $\hat{\mu}(I,J)$ such that:

1. $\hat{\mu}(I,J)$ satisfies the given constraints; and

2. $E\{\sum_I\sum_J(\hat{\mu}(I,J) - \mu_a(I,J))^2\}$ is minimized.

As in the previous section, we first consider constraints on a single sub-wavelet. Also, to
draw a parallel with Subsection V.A, we assume in Subsection VI.A that the additive noise in
the projections is white in each slice in $r$, i.e., $E[\eta(r_1,\theta_1)\eta(r_2,\theta_2)] = \delta(r_1 - r_2)\delta(\theta_1 - \theta_2)$. We
generalize to non-white noise, and several sub-wavelets, in Subsection VI.B.


A.  Constraints  on  a  Single  Wavelet 

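Let $\mu_\eta(I,J) = \mathcal{R}^{-1}\{p_\eta(r,\theta)\}$ and $\varepsilon(I,J) = \mathcal{R}^{-1}\{\eta(r,\theta)\}$, so that

$$\mu_\eta(I,J) = \mu_a(I,J) + \varepsilon(I,J). \tag{50}$$

To solve the problem, we first compute an estimate $\hat{\varepsilon}(I,J)$ of $\varepsilon(I,J)$ using the given constraints,
and then we subtract the noise estimate from the noisy image.

Let the constraints be as given in (29). Then, by taking the wavelet transform of both sides
of (50), we find (compare to (30))

$$W_{2^l}^{(1)}\varepsilon(i_{c,l}, j_{c,l}) = W_{2^l}^{(1)}\mu_\eta(i_{c,l}, j_{c,l}). \tag{51}$$

As in (30), $\mu_\eta(I,J)$ can be computed from the given projections $p_\eta(r,\theta)$, so the
$W_{2^l}^{(1)}\mu_\eta(i_{c,l}, j_{c,l})$ are known. Our goal is to compute $\hat{\varepsilon}(I,J)$, from which the image $\hat{\mu}(I,J)$
can be computed as (compare to (31))

$$\hat{\mu}(I,J) = \mu_\eta(I,J) - \hat{\varepsilon}(I,J). \tag{52}$$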

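Since $\eta(r,\theta)$ is zero-mean and jointly Gaussian in $r$ and $\theta$, $\varepsilon(I,J)$ (which is a linear
combination of $\eta(r,\theta)$) is also zero-mean and jointly Gaussian in $I$ and $J$. The solution to the
problem of finding $\hat{\varepsilon}(I,J)$ is the linear minimum mean-square error (LMMSE) estimate

$$\hat{\varepsilon}(I,J) = \sum_{l=1}^{L}\sum_{c=1}^{C(l)} \beta_{c,l}\,E[\varepsilon(I,J)\,W_{2^l}^{(1)}\varepsilon(i_{c,l}, j_{c,l})], \tag{53}$$

where the $\beta_{c,l}$ are computed by solving the matrix equation (compare to (36))

$$M\beta = b. \tag{54}$$

As in (36), $\beta$ and $b$ are vectors which contain the unknowns $\beta_{c,l}$ and the knowns $W_{2^l}^{(1)}\mu_\eta(i_{c,l}, j_{c,l})$.
The system matrix $M$ is given by (38), where the $(u,v)$th entry of $M_{l,m}$ is

$$E[W_{2^l}^{(1)}\varepsilon(i_{u,l}, j_{u,l})\,W_{2^m}^{(1)}\varepsilon(i_{v,m}, j_{v,m})]. \tag{55}$$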

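If the noise in the projections is white in $r$, i.e., if $E[\eta(r_1,\theta_1)\eta(r_2,\theta_2)] = \delta(r_1 - r_2)\delta(\theta_1 - \theta_2)$,
then it can easily be shown [24] that the autocorrelation of $\varepsilon(I,J)$ is given by

$$R_\varepsilon(I,J) = E[\varepsilon(I',J')\,\varepsilon(I + I', J + J')] = \frac{1}{2\pi}\mathcal{R}^{-1}\{q(r)\} = \frac{1}{2\pi}R(I,J). \tag{56}$$

As a consequence, we find that

$$E[\varepsilon(I,J)\,W_{2^l}^{(1)}\varepsilon(i_{c,l}, j_{c,l})] = \tilde{g}_l^{(1)}(2^l i_{c,l} - I,\ 2^l j_{c,l} - J) \tag{57}$$

and

$$E[W_{2^l}^{(1)}\varepsilon(i_{u,l}, j_{u,l})\,W_{2^m}^{(1)}\varepsilon(i_{v,m}, j_{v,m})] = \frac{1}{2\pi}\sum_{s_1}\sum_{t_1}\sum_{s_2}\sum_{t_2} g_l^{(1)}(s_1,t_1)\,g_m^{(1)}(s_2,t_2)\,R(2^m i_{v,m} - 2^l i_{u,l} + s_1 - s_2,\ 2^m j_{v,m} - 2^l j_{u,l} + t_1 - t_2), \tag{58}$$

which is the same as (40).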

We thus find that $\hat{\varepsilon}(I,J)$ computed using (53) coincides with $\delta\mu(I,J)$ computed using (41).

B. Constraints on Several Sub-Wavelets


We now assume that the constraints are given by (43), and that the autocorrelation of the
additive noise in the projections is no longer white, but is given by

$$E[\eta(r_1,\theta_1)\eta(r_2,\theta_2)] = t(r_1 - r_2)\,\delta(\theta_1 - \theta_2). \tag{59}$$

That is, the noise is still uncorrelated between projection angles, but correlated in $r$ at a single
angle. Then, using the arguments of the previous subsection, it is easily shown that the noise
estimate of the image $\hat{\varepsilon}(I,J)$ is the same as $\delta\mu(I,J)$ computed using (45), i.e.,

$$\hat{\varepsilon}(I,J) = \sum_{z=1}^{Z}\sum_{l=1}^{L}\sum_{c=1}^{C(l)} \beta^z_{c,l}\,\tilde{g}_l^{(z)}(2^l i^z_{c,l} - I,\ 2^l j^z_{c,l} - J), \tag{60}$$

where $\tilde{g}_l^{(z)}(I,J)$ is given by (46) and the $\beta^z_{c,l}$ are computed by solving $M\beta = b$, where $M$ is given
by (48) and (49).
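
The agreement between the stochastic and deterministic solutions can be checked numerically in a small finite-dimensional setting. The sketch below is an illustration under assumptions: a random positive-definite matrix `C` stands in for the noise covariance, a random matrix `A` stands in for the linear constraint operator, and `b` holds the observed constraint values. It compares the LMMSE estimate $C A^T (A C A^T)^{-1} b$ with the minimizer of $x^T C^{-1} x$ subject to $Ax = b$, obtained independently from the KKT system.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 6, 2                                   # hypothetical sizes
L_fac = rng.standard_normal((n, n))
C = L_fac @ L_fac.T + n * np.eye(n)           # stand-in noise covariance (positive definite)
A = rng.standard_normal((k, n))               # stand-in constraint operator
b = rng.standard_normal(k)                    # observed values of A @ noise

# Stochastic route: LMMSE estimate of the noise from the observations A @ noise = b.
M = A @ C @ A.T                               # covariance of the constrained quantities
beta = np.linalg.solve(M, b)
eps_hat = C @ A.T @ beta                      # cross-covariance times coefficients

# Deterministic route: minimize x^T C^{-1} x subject to A x = b via the KKT system.
C_inv = np.linalg.inv(C)
kkt = np.block([[2.0 * C_inv, A.T], [A, np.zeros((k, k))]])
rhs = np.concatenate([np.zeros(n), b])
x_det = np.linalg.solve(kkt, rhs)[:n]

print(np.max(np.abs(eps_hat - x_det)))        # difference at round-off level
```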

C.  Relation  Between  Image  Constraints  and  Projection  Constraints 

The discretized inverse Radon transform (4) can be regarded as an operation of multiplying a
vector of projection data by a matrix, yielding a vector of image pixels. Similarly, the DSTFT
(5) and the DWT (12), along with their inverse transforms (6) and (16), can also be regarded
as operations of multiplying a vector of transform values by a matrix, yielding a vector of image
pixels (or vice-versa).

As  a  result,  it  is  clear  that  constraining  certain  values  of  the  DWT  of  the  image  is  equivalent 
to  constraining  linear  combinations  of  the  DWT  of  the  projection  data,  and  vice-versa.  That 
is,  the  real  difference  between  Section  III  (in  which  DWTs  of  projection  data  were  constrained) 
and  Section  V  (in  which  DWTs  of  the  image  were  constrained)  is  in  the  specific  constraints 
applied,  and  their  physical  interpretations.  The  basic  underlying  concepts,  with  least-squares 
criteria,  are  identical.  This  unifies  the  two  problems. 
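
The matrix view makes the equivalence easy to verify on a toy example. In the sketch below, all matrices are random stand-ins and purely hypothetical: `B` for the discretized inverse Radon transform, `W` for the image DWT, and an orthogonal `V` for the projection-domain DWT. One image-domain wavelet coefficient is shown to equal a fixed linear combination of the projection-domain transform values.

```python
import numpy as np

rng = np.random.default_rng(2)
n_proj, n_pix = 12, 9                           # hypothetical sizes

B = rng.standard_normal((n_pix, n_proj))        # stand-in: projections -> image pixels
W = rng.standard_normal((n_pix, n_pix))         # stand-in: image pixels -> image DWT
V, _ = np.linalg.qr(rng.standard_normal((n_proj, n_proj)))  # stand-in orthogonal projection DWT

p = rng.standard_normal(n_proj)                 # projection data
k = 4                                           # index of one constrained image coefficient

image_coeff = (W @ (B @ p))[k]                  # k-th DWT coefficient of the image
combo = W[k] @ B @ V.T                          # weights on the projection DWT values
proj_side = combo @ (V @ p)                     # same number from the projection side

print(image_coeff, proj_side)
```

Constraining `image_coeff` to zero is therefore the same as constraining the linear combination `combo @ (V @ p)` to zero.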


VII.  IMAGE  CONSTRAINTS-EXAMPLES  AND  DISCUSSION 


In this section, we present three numerical examples which illustrate the results of Sections
V and VI. Since the stochastic and deterministic developments lead to the same formula, our
examples will be formulated in terms of the stochastic (noisy) case only. The examples are the
same as those which appeared in [27].

As mentioned in the beginning of Section V, the algorithms developed in the previous two
sections were based on complete projections. In practice, only samples $p(m,n)$ of projections
are available. This does not present any difficulties, and the only modification to our algorithm
is the replacement of (47) by

$$R_t(I,J) = \mathcal{R}^{-1}\left\{\sum_{t_1} q(t_1)\,t(m - t_1)\right\}, \tag{61}$$

where $q(m)$ is a discrete filter that approximates $q(r)$, and $t(m)$ is the radial autocorrelation
function of the discrete noise in the projections.

Example 1: The noiseless image used in this example is a disk of value 1.00 and radius 0.81
in a background of value 0. Projections of the disk are computed over 128 angles and 128 lines
in each angle. The reconstruction from noiseless projections is shown in Figure 7. The noise
added to the projections is obtained by passing zero-mean white Gaussian noise with variance
0.01  through  a  filter  whose  discrete-time  Fourier  transform  is  (sm(u;))^^.  The  100  x  100  image 
obtained  from  the  noisy  projections  using  FBP  is  shown  in  Figure  8. 

The wavelets we use are two sub-wavelets of the Haar basis, which can be regarded as
difference operators in the $I$ and $J$ directions (the third Haar sub-wavelet, which can be regarded
as a difference operator in the diagonal direction, is not used). We constrain the two finest-scale
wavelet coefficients to be zero in a 15 x 55 rectangular area $A_0$ inside the disk. We use (60) to
estimate the noise, where the matrix $M$ is given by (48).
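
The interpretation of the Haar sub-wavelets as difference operators can be illustrated with a finest-scale decomposition over 2 x 2 pixel blocks. This is a minimal sketch under assumptions: the function name, the division by 2, and the labeling of the two axes as "row" and "column" are illustrative choices, and the report's own normalization and its assignment of sub-wavelets to the $I$ and $J$ directions may differ.

```python
import numpy as np

def haar_finest_details(img):
    """Finest-scale Haar detail coefficients of an image with even dimensions."""
    a = img[0::2, 0::2]   # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]   # top-right
    c = img[1::2, 0::2]   # bottom-left
    d = img[1::2, 1::2]   # bottom-right
    d_col = (a - b + c - d) / 2.0    # difference along the column index
    d_row = (a + b - c - d) / 2.0    # difference along the row index
    d_diag = (a - b - c + d) / 2.0   # diagonal difference (unused in Example 1)
    return d_row, d_col, d_diag

flat = np.ones((4, 4))                          # constant region, like the interior of the disk
ramp = np.tile(np.arange(4.0), (4, 1))          # varies only along the column index

flat_details = haar_finest_details(flat)
ramp_row, ramp_col, ramp_diag = haar_finest_details(ramp)
print(ramp_col)   # nonzero: the ramp changes along columns
print(ramp_row)   # zero: no change along rows
```

In a flat region all three detail images vanish, which is the property the zero-coefficient constraints in $A_0$ rely on.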

The MMSE image is shown in Figure 9. Table 1 shows the average performance of our
procedure for 10 different noise realizations for this example. The area obtained by enlarging
$A_0$ by 20 pixels in every direction is denoted as $A_1$; this is roughly the area in which we expect
improvement, due to the support of the filter $\tilde{g}_l(I,J)$. The whole image is denoted by $A_t$.
From Table 1, we see that the noise in $A_0$ is almost completely eliminated. We also find that,
compared to the unprocessed noisy image, the noise powers in the regions $A_1 - A_0$ and $A_t - A_0$,
in which we do not have any wavelet constraints, are reduced by 13.1% and 6.4%, respectively.
This shows that constraining wavelet coefficients in a given region improves the reconstruction
in other regions. This is because the noise $\varepsilon$ in the reconstructed image is non-white, due to
the fact that the Radon transform is non-unitary and the additive noise $\eta$ on the projections is
non-white.

Example 2: In the second example, the original image is not flat in $A_0$, but varies smoothly
in that region. The noiseless image, which is the union of a disk and an exponential, is shown
in Figure 10. The additive noise is the same as in Example 1, and the noisy reconstruction is
shown in Figure 11. The wavelet constraints used are the same as in Example 1. Since the
noiseless image has small, but not negligible, high-resolution components, we need to use a
wavelet which has wider support than the Haar wavelet in this example. Otherwise, setting the
fine-scale wavelet coefficients to zero may cause "blocking artifacts" in the image. The result
of our algorithm with the six-coefficient Daubechies wavelet [22] is shown in Figure 12. Note
again that noise has been reduced not only in the constrained region, but in other regions.

Example  3:  In  this  example,  we  use  the  original  Shepp-Logan  head  phantom  [25]  as  our 
noiseless  image.  The  autocorrelation  of  the  noise  is  the  same  as  previous  examples,  and  the 
noise  variance  is  4  x  10~®.  The  128  x  128  noisy  image  is  shown  in  Figure  13.  We  use  the  Haar 
wavelet,  and  constrain  the  two  finest-scale  wavelet  coefficients  to  be  zero  over  a  region  D  of  the 
image. The region D is obtained by the thresholding approach over a region D' in the center of
the  image.  First,  the  second-finest  wavelet  coefficients  inside  D'  are  set  to  zero  whenever  their 
absolute  value  is  below  a  threshold.  The  region  which  will  be  affected  by  the  above  operation 
is  called  D".  Then,  inside  D",  another  threshold  is  used  to  set  the  finest-scale  coefficients  to 
zero.  The  resulting  MMSE  image  is  shown  in  Figure  14.  The  noise  power  has  been  reduced  by 
20.3%  in  the  whole  image,  while  still  preserving  the  edges. 


VIII.  CONCLUSION 


We have shown how reconstruction of CT images can be improved by time-frequency filtering.
We have considered two different filtering strategies. The first strategy is filtering the
projection data and then computing the filtered image using filtered backprojection. The second
strategy is filtering the reconstructed image directly. In the first strategy, the DSTFT or
the DWT of the projections is set to zero in certain areas of the time-frequency plane, in
effect performing a spatially-varying low-pass filtering of the data. In the second strategy, the
DWT of the reconstructed image is constrained to be zero in certain areas of the time-frequency
plane, and the reconstructed image satisfying these constraints and minimizing either the
deterministic least-squares perturbation of the projections, or the stochastic mean-square error,
is computed. These criteria were then expanded to a weighted deterministic perturbation, or
a stochastic minimum mean-square estimate in colored Gaussian noise. Examples using both
schemes show that time-frequency filtering can produce results superior to linear,
spatially-invariant filtering for CT images.

Fig. 1. Noiseless Shepp-Logan phantom.

Fig. 2. Best reconstruction from noisy data and linear time-invariant filter.

Fig.  3.  The  result  of  filtering  with  DSTFT,  the  mask  is  obtained  from  noiseless  projections. 

Fig.  4.  The  result  of  filtering  with  DWT,  the  mask  is  obtained  from  noiseless  projections. 

Fig.  5.  The  result  of  filtering  with  DSTFT,  the  mask  is  obtained  from  noisy  projections. 

Fig.  6.  The  result  of  filtering  with  DWT,  the  mask  is  obtained  from  noisy  projections. 

Fig.  7.  The  reconstruction  of  the  disk  used  in  Example  1  from  noiseless  projections. 

Fig.  8.  The  noisy  image  for  Example  1,  obtained  from  the  noisy  projections  by  FBP. 

Fig. 9. The MMSE image obtained by constraining the two finest-scale wavelet coefficients in
$A_0$ to 0.

Fig.  10.  The  reconstruction  of  the  image  used  in  Example  2  from  noiseless  projections. 

Fig.  11.  The  noisy  image  for  Example  2,  obtained  from  the  noisy  projections  by  FBP. 

Fig. 12. The MMSE image obtained by constraining the two finest-scale wavelet coefficients in
$A_0$ to 0; the wavelet basis function is the six-coefficient Daubechies wavelet; Example 2.

Fig.  13.  The  noisy  image  for  Example  3,  obtained  from  the  noisy  projections  of  the  Shepp- 
Logan  phantom. 

Fig. 14. MMSE image obtained by constraining the wavelet coefficients of the noisy image;
Example 3.

Abstract: We compare the performance of the eigenimage
filter to that of several other filters, applied to magnetic resonance
image (MRI) scene sequences for image enhancement and
segmentation. Comparisons are made with principal component
analysis, matched, modified-matched, maximum contrast, target
point, ratio, log-ratio, and angle image filters. Signal-to-noise ratio
(SNR), contrast-to-noise ratio (CNR), segmentation of a desired
feature (SDF), and correction for partial volume averaging effects
(CPV) are used as performance measures. For comparison,
analytical expressions for SNRs and CNRs of filtered images
are derived, and CPV by a linear filter is studied. Properties
of filters are illustrated through their applications to simulated
and acquired MRI sequences of a phantom study and a clinical
case; advantages and weaknesses are discussed. Our conclusion is
that the eigenimage filter is the optimal linear filter that achieves
SDF and CPV simultaneously.

I.  Introduction 

CONTRAST in MR images depends on at least five major
intrinsic tissue parameters: proton density (N(H)); spin-lattice
(T1) and spin-spin (T2) relaxation times; flow velocity
($\nu$); and chemical shift ($\delta$). It also depends on four parameters
of the pulse sequence: repetition time (TR); echo time (TE);
inversion time (TI); and flip angle ($\theta$). Hence, images are often
difficult to interpret, and the observer must extract relevant
information from all images in the MRI scene sequence.
A variety of filters have been developed to assist with
this difficult task of picture interpretation. A transformation
(filtration) can be viewed as an information rearrangement of
data, so that the information in the transformed domain is
easier to visualize and interpret. Some transformations achieve
data reduction by removing extraneous information, such as
interfering features or noise. Hence, they can be used for
contrast enhancement and feature extraction, as well as for data
compression. We proceed with a brief review of some well-known
transformations applicable to MRI scene sequences
(some details are given in Sections V-A, -B, and -C, in the
Appendix).

A. Review of Linear Transformations

A linear transformation which has been applied in a variety
of fields [1]-[16], including MRI [14]-[16], is principal component
analysis (PCA). PCA is a method of statistical analysis
which has been employed in digital image processing as a
technique for image coding, compression, enhancement, and
feature extraction [6]-[9]. This filter gives linear combinations
of the images which maximize the variance over a region of
interest (ROI). The process also reduces the dimensionality
of the useful data space. The first PCA image is the linear
combination that achieves maximum global signal-to-noise
ratio (GSNR) [17].
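
As a rough illustration of PCA applied to an image sequence, the sketch below forms the covariance of the pixel vectors inside an ROI and projects every pixel vector onto its eigenvectors. It is a minimal sketch under assumptions: the function name `pca_images`, the synthetic four-image sequence, and the ROI are hypothetical, and the cited references may normalize or center the data differently.

```python
import numpy as np

def pca_images(seq, roi_mask):
    """seq: (n, H, W) image stack; roi_mask: boolean (H, W). Returns PCA images."""
    n = seq.shape[0]
    X = seq.reshape(n, -1)                      # one pixel vector per column
    roi = X[:, roi_mask.ravel()]                # pixel vectors inside the ROI
    mean = roi.mean(axis=1, keepdims=True)
    cov = np.cov(roi)                           # n x n covariance over the ROI
    evals, evecs = np.linalg.eigh(cov)          # ascending eigenvalues
    order = np.argsort(evals)[::-1]             # reorder to descending variance
    evals, evecs = evals[order], evecs[:, order]
    comps = (evecs.T @ (X - mean)).reshape(seq.shape)
    return comps, evals, evecs

rng = np.random.default_rng(3)
base = rng.standard_normal((8, 8))
seq = np.stack([k * base + 0.1 * rng.standard_normal((8, 8))
                for k in (1.0, 0.8, 0.6, 0.4)])   # four correlated synthetic images
roi_mask = np.zeros((8, 8), dtype=bool)
roi_mask[2:6, 2:6] = True

comps, evals, evecs = pca_images(seq, roi_mask)
print(evals)   # variances of the PCA images over the ROI, largest first
```

The first PCA image has at least as much variance over the ROI as any single image in the sequence, since each original image is itself one of the unit-weight linear combinations being compared.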

Sometimes, maximizing the signal-to-noise ratio (SNR) of
only a particular feature is of interest. In this situation, a
matched filter is optimum. The matched filter was originally
derived as the optimal linear filter for maximizing the output
SNR [18]. It has also been used as the optimum receiver
for detection of a known signal in white noise [19], and for
medical image enhancement [20]. The basic matched filter
does not remove any features from the scene. Its modified
version (modified-matched filter) [21]-[23] removes any constant
(bias) feature, at the expense of a decrease in SNR of
the desired feature as compared to the basic matched filter.

In many applications, there are non-constant features which
interfere with the observation of the desired object. To remove
these interfering features at the lowest possible cost (least
decrease in SNR of the desired feature), the eigenimage filter
has been developed [24]-[28]. The eigenimage filter was
originally derived as the linear filter which maximizes the
ratio of a desired feature energy to one or more undesired
(interfering) feature energies in a composite image called the
eigenimage [24]-[26]. We have recently found that this filter
maximizes the SNR of the projection of a desired feature
while suppressing the projections of interfering features in
the eigenimage. Moreover, as we prove in Section V-E in
the Appendix, it has the advantage of correcting for partial
volume averaging effects (CPV).

In some applications, maximization of the contrast-to-noise
ratio (CNR) between two features is desirable. In this case, the
maximum contrast (difference) filter is the optimum solution
[29]. The maximum contrast filter was originally derived for
detecting one of two known signals in white noise [19]. There
is no unique extension of the procedure to the general case
of multiple interfering features in the scene. One possibility
is to use PCA. Another possibility is to use the linear filter
which provides the largest value for the minimum absolute
CNR (max. min. |CNR|) between a desired feature and
multiple interfering processes [30]. This requires a search
among several possibilities to find the optimal filter; we refer
to this filter as maximized minimum absolute CNR (MMAC).

B. Review of Nonlinear Transformations

The nonlinear transformation which maximizes the CNR
between an ROI and multiple interfering features is the target
point image method (TPIM) [31]. In this approach, the filtered
image is formed by calculating the Euclidean distance between
each pixel vector and a target signature vector which is
modeled or estimated. TPIM is analogous to using a separate
matched filter for each location-dependent contrast vector; this
maximizes CNR between the target region and each pixel. In
a target point image (TPI), the values assigned to the pixels
in the target region are near zero, while all other pixels in
the image have larger values. The intensities in a TPI can be
inverted, so that the target region appears white instead of
black, consistent with the output of a linear filter. We refer to
this inverted image as an inverted target point image (ITPI).
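
The target point image can be sketched in a few lines. The example below is a minimal illustration under assumptions: the function names are hypothetical, and the inversion is done as "maximum minus value", which is only one plausible way to make the target region bright; the cited reference [31] may invert differently.

```python
import numpy as np

def target_point_image(seq, target):
    """Euclidean distance between each pixel vector of seq (n, H, W) and target (n,)."""
    diff = seq - target[:, None, None]
    return np.sqrt(np.sum(diff ** 2, axis=0))

def inverted_tpi(tpi):
    """Invert intensities so that pixels close to the target appear bright (assumed convention)."""
    return tpi.max() - tpi

target = np.array([1.0, 2.0, 2.0])   # hypothetical target signature for a 3-image sequence
seq = np.zeros((3, 2, 2))            # background pixel vectors are all zero
seq[:, 0, 0] = target                # one pixel belongs to the target tissue

tpi = target_point_image(seq, target)
itpi = inverted_tpi(tpi)
print(tpi)    # 0 at the target pixel, 3 elsewhere
print(itpi)   # 3 at the target pixel, 0 elsewhere
```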

The ratio filter is another nonlinear filter which has previously
been used for multispectral image enhancement in
the field of remote sensing [32]-[34]. In many multispectral
imaging systems the image can be modeled by the product
of an object reflectivity function and an illumination function
which is almost identical for all multispectral images. Dividing
two such images provides an automatic normalization or
compensation of the illumination factor. The same idea is
applicable to T2-weighted multiple spin-echo MR images.
Here, the signal can be approximated by a random variable
multiplied by a damping exponential (T2 decay). It is known
that intrinsic parameters (N(H), T1, and T2) of a specific
tissue have random distributions. In a ratio image, most of the
signal variations within a specific tissue are compensated (see
Section V-C in the Appendix for details). This can improve
the SNR and CNR of the image. In general, the ratio image
between any two images can be defined as long as the divisor
image is everywhere nonzero. For an image sequence, the ratio
images can also be computed with respect to an average image;
this reduces the propagating noise.

A problem with the ratio filter is the accentuation of the
gray scale noise associated with each image. This can be
reduced significantly by homomorphic filtering, i.e., taking
logarithms of the ratio images. This generates log-ratio images
[32]-[34]. If there are very small gray levels in a ratio image,
the logarithm function generates very large negative numbers,
resulting in a very large dynamic range for the log-ratio image.
This may be avoided by taking the arctangent of a ratio image;
this produces an angle image [35].
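
The three ratio-type filters differ only in the pointwise function applied after the division, as the following sketch shows. The function names and the sample gray levels are illustrative assumptions; the divisor image is assumed to be strictly positive.

```python
import numpy as np

def ratio_image(num, den):
    """Pixelwise ratio of two images; den must be nonzero everywhere."""
    return num / den

def log_ratio_image(num, den):
    """Logarithm of the ratio image (requires positive ratios)."""
    return np.log(num / den)

def angle_image(num, den):
    """Arctangent of the ratio image; bounded even when the ratio is tiny or huge."""
    return np.arctan(num / den)

a = np.array([1e-6, 1.0, 100.0])   # hypothetical gray levels, one very small
b = np.ones(3)

r = ratio_image(a, b)
lr = log_ratio_image(a, b)
ang = angle_image(a, b)

# A common multiplicative factor (e.g. illumination) cancels in the ratio.
r_scaled = ratio_image(5.0 * a, 5.0 * b)

print(lr)    # the tiny gray level maps to a large negative number
print(ang)   # all values stay between 0 and pi/2
```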

Other nonlinear transformations generate calculated or synthesized
images [36]-[45]. Proton density and relaxation time
images can be calculated from a set of acquired images.
It is also possible to synthesize a contrast-optimized image
with optimal contrast between two tissue types. Due to the
propagation of noise, however, the conspicuity of these images
is usually no better than that of the best acquired image [46], [29],
[14]. Moreover, their calculations involve solving nonlinear
equations which are quite time consuming. Therefore, we do
not consider them here.

In previous work little has been done to 1) apply the
ratio, log-ratio, and angle image filters to MRI images;
2) derive analytical expressions for the SNR and CNR of
composite images (for most of the filters); and 3) compare their
performances when applied to MRI scene sequences. Such a
study would be helpful to those who want to choose the most
appropriate filter for their particular needs.

C.  Our  Contribution 

New contributions of this work are five-fold: 1) application
of ratio, log-ratio, and angle image filters to MRI
(see Section V-C in the Appendix for a rationale of this
application); 2) specification of analytical expressions for
SNR's and CNR's; 3) evaluation of these expressions by
comparing computed and actual SNR's and CNR's for all of
the above filters; 4) investigation of CPV by a linear filter;
and 5) illustration of the filters' properties using simulated
and acquired MR images.

In Section II, we briefly discuss our approach and criteria
for comparing composite images. In Section III, we apply
the above filters to simulated and acquired MRI sequences
of an egg phantom and a human brain. In Section IV, we
compare the results and discuss the advantages and weaknesses
of each transformation for enhancement, segmentation, and
partial volume correction of MRI scene sequences. The Appendix
includes: some details of the filters; the rationale for the application
of the ratio filter to MRI; methods of deriving SNR and CNR
expressions; a proof of correction for partial volume averaging
effects by the eigenimage filter; a list of abbreviations; and a list
of notations.

II.  Performance  Measures 

In this study, we use simulated and acquired MR images of
an egg phantom and a human brain. Each sequence consists
of four T2-weighted and one T1-weighted MR images, for
access to proton density, T1, and T2 information using only
two acquisitions in a clinically-reasonable amount of time. We
use a multiple spin-echo to generate four T2-weighted images,
and a single spin-echo to generate a T1-weighted image. For
performance evaluation, we consider SNR, CNR, SDF, and
CPV in composite images generated by the above filters.

A.  Signal-to-Noise  Ratio 

For  image  interpretation,  one  is  usually  interested  in  vi¬ 
sualization  of  a  specific  tissue  referred  to  as  the  desired 
feature.  Let  be  the  gray  level  of  the  (j,  A:)th  pixel  in  the 
desired  ROI  (DROI).  Pf^.  consists  of  a  deterministic  value 
plus  statistical  noise;  the  desired  deterministic  value  if  then 
the  mean  E[Pji^],  and  the  strength  of  the  noise  is  the  standard 
deviation  (Var(P]^j.))^/^.  Signal-to-noise  ratio  of  the  desired 
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feature  (SNRd)  is  defined  as 


SNRd  = 


(1) 


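Since statistical noise in MRI is ergodic and uncorrelated with
the signal [47]-[48], $E[P^d_{jk}]$ and $\mathrm{Var}(P^d_{jk})$ in a homogeneous
region of $M$ pixels can be estimated using the sample mean
and variance [49]

$$\hat{E}[P^d_{jk}] = \frac{1}{M} \sum_{j,k} P^d_{jk} \qquad (2)$$

$$\widehat{\mathrm{Var}}(P^d_{jk}) = \frac{1}{M-1} \sum_{j,k} \left(P^d_{jk} - \hat{E}[P^d_{jk}]\right)^2. \qquad (3)$$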

The estimated values $\hat{E}[P^d_{jk}]$ and $\widehat{\mathrm{Var}}(P^d_{jk})$ are then inserted
into (1). We make the following assumptions, which were also
made in [23], [29], [30], [31]: 1) statistical noise in MRI can
be modeled as a Gaussian distributed zero-mean white noise
field with standard deviation $\sigma$; and 2) signature vectors are
a priori known fairly well. Then, the standard formula for
noise propagation [50], [51] ((23)-(24) in Section V-D in the
Appendix) shows that the SNR of the desired object in a composite
image generated by a linear filter is given by

$$\mathrm{SNR}_d = \frac{W \cdot d}{\sigma \, (W \cdot W)^{1/2}} \qquad (4)$$

where $W$ and $d$ are the weighting vector and the desired
signature vector, respectively.
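As an illustration of (1)-(4), the following Python sketch estimates the SNR of a region of interest from its gray levels and evaluates (4) for two weighting vectors. It is only a sketch under stated assumptions: the function names are hypothetical, the desired signature vector is expressed in units of $\sigma$ using the per-image $\mathrm{SNR}_d$ values reported for the simulation in Table III(a), and the modified matched filter is assumed to be the desired signature vector with its component along the constant vector removed.

```python
import math
import random

def roi_snr(pixels):
    """Estimate SNR_d from ROI gray levels with the sample mean (2) and variance (3)."""
    m = len(pixels)
    mean = sum(pixels) / m
    var = sum((p - mean) ** 2 for p in pixels) / (m - 1)
    return mean / math.sqrt(var)

def linear_filter_snr(w, d, sigma):
    """SNR_d of a linearly filtered image, eq. (4): W.d / (sigma * |W|)."""
    dot = sum(wi * di for wi, di in zip(w, d))
    norm = math.sqrt(sum(wi * wi for wi in w))
    return dot / (sigma * norm)

# Desired signature vector in units of sigma: per-image SNR_d values of the
# simulation in Table III(a) (four T2-weighted images, then the T1-weighted one).
d = [47.993, 27.734, 16.003, 10.663, 56.525]
sigma = 1.0

# Matched filter: the weighting vector equals the desired signature vector.
snr_matched = linear_filter_snr(d, d, sigma)

# Modified matched filter (assumed form): d minus its projection onto
# the constant vector c = [1, ..., 1].
mean_d = sum(d) / len(d)
w_mod = [di - mean_d for di in d]
snr_mod = linear_filter_snr(w_mod, d, sigma)

# Monte Carlo check of the ROI estimator on a synthetic homogeneous region
# whose true SNR is 100 / 5 = 20.
rng = random.Random(0)
roi = [100.0 + rng.gauss(0.0, 5.0) for _ in range(20000)]
snr_roi = roi_snr(roi)

print(round(snr_matched, 3), round(snr_mod, 3), round(snr_roi, 2))
```

Under these assumptions the two filter values agree with the mathematical $\mathrm{SNR}_d$ entries listed for the matched and modified matched filters in Table III(b).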

Using  (4)  and  the  analytical  expressions  for  the  weighting 
vectors  (given  in  Section  V-A  in  the  Appendix),  we  have 
derived  the  mathematical  expressions  for  SNR  of  the  desired 
feature  in  linearly  filtered  images.  For  TPI’s  and  ITPI’s,  we 
used  the  fact  that  the  squared  Euclidean  distance  between  a 
pixel  vector  in  the  DROI  and  the  desired  signature  vector  has 
a  chi-squared  distribution  with  n  degrees  of  freedom  where  n 
is  the  number  of  images  in  the  sequence.  Hence,  the  expected 
values and standard deviations necessary for the $\mathrm{SNR}_d$'s can be
determined (see Section V-D in the Appendix for details). For
ratio, log-ratio, and angle images, we have derived mathematical
expressions for the $\mathrm{SNR}_d$'s in the corresponding composite
images using the standard formula for noise propagation
(23)-(24). The resulting analytical expressions for all of these
filters are summarized in the second columns of Tables I and
II; details of the derivations are given in [52].
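The two derivation routes described above can be sketched numerically. The Python sketch below rests on assumptions that are not stated in this section: the TPI gray level is taken to be the Euclidean distance $D$ to the desired signature vector and the ITPI gray level its reciprocal, and the noise propagation formula is taken to be the usual first-order (partial derivative) approximation. All function names are hypothetical. The inputs 56.525 and 10.663 are the $\mathrm{SNR}_d$ values of the T1-weighted and fourth T2-weighted simulated images in Table III(a).

```python
import math

def chi_squared_moment(n, k):
    """E[X^k] for X chi-squared with n degrees of freedom (needs n/2 + k > 0)."""
    return 2.0 ** k * math.gamma(n / 2.0 + k) / math.gamma(n / 2.0)

def distance_power_snr(n, p):
    """Mean / standard deviation of D^p, where (D / sigma)^2 is chi-squared
    with n degrees of freedom; sigma cancels in the ratio."""
    mean = chi_squared_moment(n, p / 2.0)
    second_moment = chi_squared_moment(n, p)
    return mean / math.sqrt(second_moment - mean ** 2)

def two_image_snrs(snr_num, snr_den):
    """First-order noise propagation for r = a / b, log(r), and arctan(r),
    given the SNRs a / sigma and b / sigma of the two source images."""
    r = snr_num / snr_den
    rel_std = math.sqrt(1.0 / snr_num ** 2 + 1.0 / snr_den ** 2)  # std(r) / r
    snr_ratio = 1.0 / rel_std
    snr_log = math.log(r) / rel_std                  # std(log r) ~ std(r) / r
    snr_angle = math.atan(r) * (1.0 + r ** 2) / (r * rel_std)  # std(atan r) ~ std(r) / (1 + r^2)
    return snr_ratio, snr_log, snr_angle

n = 5                                   # images in the sequence
snr_tpi = distance_power_snr(n, 1)      # gray level assumed to be D
snr_itpi = distance_power_snr(n, -1)    # gray level assumed to be 1 / D
snr_r, snr_lr, snr_ar = two_image_snrs(56.525, 10.663)

print(snr_tpi, snr_itpi, snr_r, snr_lr, snr_ar)
```

With n = 5, these assumptions give values that agree with the mathematical $\mathrm{SNR}_d$ entries for the TPI, ITPI, and the first ratio, log-ratio, and angle rows of Table III(b), which suggests, but does not prove, that they match the derivations in [52].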

B. Contrast-to-Noise Ratio

In addition to $\mathrm{SNR}_d$, the CNR between the desired tissue
and an undesired (interfering) feature (background or another
tissue type) is usually important for image interpretation. The
CNR between the desired feature and an undesired feature
($\mathrm{CNR}_{du}$) is defined as [29]

$$\mathrm{CNR}_{du} = \frac{E[P^d_{jk}] - E[P^u_{jk}]}{\left(\mathrm{Var}(P^d_{jk}) \, \mathrm{Var}(P^u_{jk})\right)^{1/4}} \qquad (5)$$

where $P^u_{jk}$ is the gray level of the (j, k)th pixel in the undesired
ROI (UROI), and $E[P^u_{jk}]$ and $\mathrm{Var}(P^u_{jk})$ are the mean and the
variance of pixel values in the UROI, respectively.

Making the assumptions and using the techniques described
for $\mathrm{SNR}_d$, we have derived the $\mathrm{CNR}_{du}$ for the above filters.
The resulting expressions are summarized in the third and
fourth columns of Table I and the third column of Table II;
details of the derivations are given in [52].
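For a linear filter, the contrast-to-noise ratio can be sketched in the same way as (4). The Python sketch below assumes white noise with the same standard deviation $\sigma$ in the desired and undesired ROIs, so that the CNR of the filtered image reduces to $W \cdot (d - u) / (\sigma (W \cdot W)^{1/2})$. The function names are hypothetical, and the signature vectors are rebuilt in units of $\sigma$ from the simulation rows of Table III(a).

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def linear_filter_cnr(w, d, u, sigma):
    """CNR_du of a linearly filtered image under white noise of standard
    deviation sigma in both ROIs: W.(d - u) / (sigma * |W|)."""
    diff = [di - ui for di, ui in zip(d, u)]
    return dot(w, diff) / (sigma * math.sqrt(dot(w, w)))

# Simulation rows of Table III(a), in units of sigma (sigma = 1 assumed).
snr_d = [47.993, 27.734, 16.003, 10.663, 56.525]
cnr_u1 = [-16.542, -14.389, -9.599, -6.417, 1.063]
cnr_u2 = [-4.809, -19.215, -22.409, -24.535, 30.383]

# Per image, CNR = (d - u) / sigma, so each undesired signature is d - CNR.
d = snr_d
u1 = [s - c for s, c in zip(snr_d, cnr_u1)]
u2 = [s - c for s, c in zip(snr_d, cnr_u2)]

# Matched filter (W = d): contrast against each undesired feature.
cnr_matched_u1 = linear_filter_cnr(d, d, u1, 1.0)
cnr_matched_u2 = linear_filter_cnr(d, d, u2, 1.0)

print(round(cnr_matched_u1, 3), round(cnr_matched_u2, 3))
```

The two values agree with the mathematical CNR entries of the matched filter row in Table III(b), which is consistent with the equal-variance assumption made here.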


C.  Segmentation  of  a  Desired  Feature 

If  a  filtered  image  can  be  windowed  (linearly  histogram 
equalized)  so  that  only  the  desired  feature  is  visualized  in 
the  scene,  the  corresponding  filter  has  segmented  the  desired 
feature (SDF). Although $\mathrm{SNR}_d$ and $\mathrm{CNR}_{du}$ give quantitative
measures of the image quality, they do not indicate whether SDF
from undesired features is achieved. Therefore, we consider
SDF  in  evaluating  the  composite  image  quality;  complete  SDF 
is  achieved  if  and  only  if  all  interfering  feature  signature 
vectors  are  mapped  into  gray  levels  which  are  all  less  than 
or  all  greater  than  the  gray  level  into  which  the  desired 
signature  vector  is  mapped.  In  this  paper,  we  evaluate  SDF 
by  inspection. 
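Although SDF is evaluated by inspection in this paper, the criterion for complete SDF stated above can also be written as a simple check on the gray levels into which the signature vectors are mapped. The following Python sketch is illustrative only; the function names and the example numbers are hypothetical.

```python
def filtered_levels(w, signatures):
    """Gray levels into which a linear filter with weighting vector w maps
    a list of signature vectors."""
    return [sum(wi * si for wi, si in zip(w, s)) for s in signatures]

def achieves_complete_sdf(desired_level, interfering_levels):
    """Complete SDF: every interfering gray level lies strictly below, or
    every one lies strictly above, the desired gray level."""
    all_below = all(g < desired_level for g in interfering_levels)
    all_above = all(g > desired_level for g in interfering_levels)
    return all_below or all_above

# Hypothetical two-image example: one desired and two interfering signatures.
w = [1.0, -1.0]
desired = filtered_levels(w, [[10.0, 2.0]])[0]             # maps to 8.0
interfering = filtered_levels(w, [[5.0, 4.0], [7.0, 6.0]])  # both map to 1.0
print(achieves_complete_sdf(desired, interfering))
```

When the check succeeds, a window (gray-level threshold) placed between the desired level and the nearest interfering level displays only the desired feature.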


D.  Correction  for  Partial  Volume  Averaging  Effects 

We assume that the magnetic resonance signal $S$ from a
voxel containing $m$ different materials is given by

$$S = \sum_{i=1}^{m} \frac{V_i}{V} S_i \qquad (6)$$

where $V_i$ is the volume of the ith material within the voxel, $V$
is the total volume of the voxel, and $S_i$ is the signal from the
ith material. This is a reasonable assumption, since the signal
from a voxel is directly proportional to the net macroscopic
magnetization. The net macroscopic magnetization is the sum
of all of the individual magnetic moments, provided that
the frequency bandwidth across the voxel is larger than the
chemical shifts of the different chemical materials in the voxel.
Hence, the gray level $P_{jk}$ of the (j, k)th pixel (corresponding
to the (j, k)th voxel) in an MR image is given by

$$P_{jk} = E[P_{jk}] + \eta_{jk} = \sum_{i=1}^{m} \frac{V_{ijk}}{V} S_i + \eta_{jk} \qquad (7)$$

where $V_{ijk}$ is the partial volume of the ith material in the
(j, k)th voxel, and $\eta_{jk}$ represents statistical noise which is
again assumed to be an additive zero-mean white Gaussian
noise field with standard deviation $\sigma$, uncorrelated with the
signal.

Correction for partial volume averaging effects (CPV) means
that in the transformed image (TI) we obtain

$$E[TI_{jk}] = \left(\frac{V_{djk}}{V}\right) E[T(P^d_{lm})] \qquad (8)$$

where $E[TI_{jk}]$ is the mean value of the (j, k)th pixel in the
transformed image, $V_{djk}$ is the partial volume of the desired
material in the (j, k)th voxel, and $E[T(P^d_{lm})]$ is the mean value
of the (l, m)th pixel in the desired ROI of the transformed
image. This correction is necessary for correct interpretation
and analysis of MR images, as well as for volume calculations.
Since none of the previous factors ($\mathrm{SNR}_d$, $\mathrm{CNR}_{du}$, and SDF)
indicates whether this correction is achieved, we consider it as a
fourth factor in the evaluation of composite image quality. In
Section V-E in the Appendix we show that the eigenimage
filter achieves CPV, and that none of the other linear filters
listed above does so.

TABLE I
Analytical Expressions for SNR_d, CNR_du1, and CNR_du2 of Linear Filters (Three Features in the Scene)

Filter      SNR_d      CNR_du1      CNR_du2
PCA
Matched
Mod-Mat
Eigen
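The meaning of (8) can be illustrated with a noise-free numerical sketch. It assumes, following the later description of the eigenimage filter in terms of the projection of the desired signature vector onto the undesired subspace, that the weighting vector is the desired signature vector minus that projection. The signature vectors, volume fractions, and function names below are hypothetical and serve only as an illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def orthogonal_complement_filter(d, undesired):
    """Weighting vector: d minus its projection onto span(undesired),
    computed with Gram-Schmidt orthonormalization."""
    basis = []
    for u in undesired:
        v = list(u)
        for b in basis:
            coef = dot(v, b)
            v = [vi - coef * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(dot(v, v))
        basis.append([vi / norm for vi in v])
    w = list(d)
    for b in basis:
        coef = dot(w, b)
        w = [wi - coef * bi for wi, bi in zip(w, b)]
    return w

def mixed_voxel(fractions, signatures):
    """Noise-free mean gray levels of a voxel, eqs. (6)-(7): sum_i (V_i / V) S_i."""
    n = len(signatures[0])
    return [sum(f * s[k] for f, s in zip(fractions, signatures)) for k in range(n)]

# Hypothetical signature vectors of three materials over a five-image sequence.
d = [48.0, 28.0, 16.0, 11.0, 57.0]
u1 = [65.0, 42.0, 26.0, 17.0, 55.0]
u2 = [53.0, 47.0, 38.0, 35.0, 26.0]

w = orthogonal_complement_filter(d, [u1, u2])
pure_level = dot(w, d)                         # E[T(P^d)] in the desired ROI
voxel = mixed_voxel([0.3, 0.5, 0.2], [d, u1, u2])
estimated_fraction = dot(w, voxel) / pure_level   # eq. (8) solved for V_d / V

print(estimated_fraction)
```

Because the weighting vector is orthogonal to both undesired signatures, only the desired material contributes to the filtered gray level, so the ratio recovers the volume fraction of the desired material in the voxel.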

III. Simulation and Acquired Image Results

A.  Discussion  of  Analytical  Expressions 

For a scene with three features, we have derived and listed
the $\mathrm{SNR}_d$'s and $\mathrm{CNR}_{du}$'s of linear filters in Table I. The $\mathrm{SNR}_d$
of the matched filter is the highest in the list, since all other
$\mathrm{SNR}_d$'s have sinusoidal factors. The $\mathrm{CNR}_{du}$'s of the MMAC
filter are the highest among all $\mathrm{CNR}_{du}$'s in the list. The
eigenimage filter is unique in that it has equal $\mathrm{SNR}_d$ and
$\mathrm{CNR}_{du}$'s.

Analytical expressions for the $\mathrm{SNR}_d$'s and $\mathrm{CNR}_{du}$'s of
nonlinear filters are listed in Table II. Note that these expressions
can be used for a scene with an arbitrary number of interfering
features. Except for the ratio, log-ratio, and angle images
(computed using only two images in the sequences), all other SNR's
and CNR's monotonically increase with the number of images
in the sequence. As expected, SNR's and CNR's of ratio,
log-ratio, and angle images calculated with respect to the average
image are larger than the corresponding values for composite
images calculated using only two images in the sequence.

The analytical expressions in Tables I and II can be used
to estimate the $\mathrm{SNR}_d$'s and $\mathrm{CNR}_{du}$'s of the filtered images
without actual filtering of the image sequence. In addition
to the comparison, this provides the possibility of optimizing
MRI pulse sequence parameters for each filter; we discuss
such an optimization procedure for the eigenimage filter in a
companion paper [53].

TABLE II
Analytical Expressions for SNR_d and CNR_du of Nonlinear Filters (Multiple Features in the Scene)

Notations  used  in  Tables  I  and  II 


“1 1  The  angle  between  tTi  and  the  constant  vector  c  =  [1,  1,  •  ■  • ,  1]^. 

021  The  angel  between  lij  and  the  constant  vector  c. 

The  angle  between  d  and  the  eigenvector  corresponding  to  the  maximum  eigenvalue  of  the  sample 

covariance  matrix  ( Cm)- 

■  The  angle  between  j  and  Sm . 

^ •  The  angle  between  ^d  —  Uij  and  Sm- 

Q  ,  -m 

The  angle  between  d  and  its  projection  onto  the  subspace  spanned  by  ui  and  U2 . 

'  The  angle  between  d  and  ui . 

^2 :  The  angle  between  d  and  52 . 

The  angle  between  d  and  the  constant  vector  c. 

■  The  angle  between  d  and  ^3  —  Si  j . 

■  The  angle  between  3  and  ^3  —  *2) . 

The  angle  between  ^3 -  Si)  and  (3-52). 
$\Gamma(\cdot)$: The gamma function, $\Gamma(t) = \int_0^\infty x^{t-1} e^{-x}\,dx$, $t > 0$.

$n$: The number of images in the original MRI scene sequence.

$d_l$: The average gray level in the DROI of the lth original image (lth element of $d$).

$d_0$: The average gray level in the DROI of the average image (average of elements of $d$).

$u_l$: The average gray level in the UROI of the lth original image (lth element of $u$).

$u_0$: The average gray level in the UROI of the average image (average of elements of $u$).

B. Specifications of Experiments

In order to illustrate the performance of different
transformations applicable to MR scene sequences, we have used
simulated and acquired MRI sequences. Each sequence
consists of four T2-weighted and one T1-weighted MR images.
Simulations were performed on a Vicom image processing
computer. MRI sequences of an egg phantom, an
agarose phantom, and a human brain were acquired on a 1.5 T
General Electric Signa MRI system.

The first example is a simulated MRI sequence containing
three overlapping circular regions which simulate
white matter, gray matter, and cerebrospinal fluid (CSF). The
second and third examples are acquired MRI sequences of a
shell-removed hard-boiled egg in gelatin, and a human brain,
respectively. The fourth example consists of acquired MRI
sequences of an agarose phantom, used to illustrate partial volume
correction by the eigenimage filter in practice. For the first three
examples, original image sequences are shown in Figs. 1, 4, and 7,
respectively. Transformed images are shown in Figs. 2-3,
5-6, and 8-9, respectively. The original images in Figs. 1,
4, and 7 are windowed (linearly histogram equalized) together
to provide optimal contrast for the observation of all features
present in the scene. Transformed images are individually
windowed to provide optimal contrast for the segmentation
of our desired feature (egg-white for egg, white matter for the
simulation and brain). For the agarose phantom, a spin-echo
sagittal slice, four multiple spin-echo coronal slices, and two
coronal eigenimages are shown in Fig. 11.

C.  Results  for  Linear  and  TPI  Filters 

The  filters  listed  in  Section  I  were  implemented  and  tested 
on  a  Vicom  image  processing  computer.  Sample  covariance 
matrices were estimated over all image pixels, and
five  PCA  images  were  generated  for  each  sequence.  For 
the  simulation,  they  contained  73.297%,  25.334%,  0.981%, 
0.196%,  and  0.192%  of  the  variance,  respectively.  Note  that 
the  variance  can  be  viewed  as  a  measure  of  the  image 
information  content  if  it  is  computed  over  a  significant  portion 
of  the  image  containing  different  tissues  [15].  For  the  egg 
phantom,  PCA  images  contained  95.364%,  4.466%,  0.135%, 
0.018%,  and  0.017%  of  the  variance,  respectively.  For  the 
brain,  they  contained  91.553%,  7.746%,  0.564%,  0.078%,  and 
0.059%  of  the  variance,  respectively. 
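The percentages quoted above are the shares of the total variance carried by each PCA image. A minimal NumPy sketch of how such percentages can be obtained is given below. It is not the implementation used on the Vicom system: the function name is hypothetical, the data are synthetic, and mean removal before estimating the covariance matrix is an assumption.

```python
import numpy as np

def pca_images(sequence):
    """sequence: array of shape (n_images, rows, cols). Returns the PCA images
    and the percentage of total variance carried by each, in decreasing order."""
    n, rows, cols = sequence.shape
    pixels = sequence.reshape(n, -1).astype(float)        # one column per pixel vector
    centered = pixels - pixels.mean(axis=1, keepdims=True)
    cov = centered @ centered.T / (pixels.shape[1] - 1)   # sample covariance over all pixels
    eigvals, eigvecs = np.linalg.eigh(cov)                # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = (eigvecs.T @ centered).reshape(n, rows, cols)
    percent = 100.0 * eigvals / eigvals.sum()
    return components, percent

# Synthetic five-image sequence with very different variances per image.
rng = np.random.default_rng(0)
scales = np.array([5.0, 3.0, 1.0, 0.5, 0.2]).reshape(5, 1, 1)
seq = rng.normal(size=(5, 32, 32)) * scales
components, percent = pca_images(seq)
print(np.round(percent, 3))
```

Each PCA image is the projection of the mean-removed pixel vectors onto one eigenvector of the sample covariance matrix, and its variance equals the corresponding eigenvalue, which is why the eigenvalue shares give the variance percentages.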

The first and second PCA images are shown in Figs. 2(a),
5(a), 8(a), and 3(a), 6(a), 9(a), respectively. The second PCA
image provides the best segmentation of our desired feature
for the egg and brain examples, as compared to the other PCA
images. This suggests that SDF is not necessarily achieved in
the scene with the largest variance. Note that the first PCA
image for the brain is quite similar to the matched filtered
image for the white matter, since this is the dominant tissue
type in the slice.

Fig. 1. (a)-(e) Four simulated spin-echo T2-weighted (TE/TR = 25-100/
2500 ms) and one T1-weighted (TE/TR = 25/500 ms) MR images, respectively.
The scene consists of three overlapping circular regions: the top circle
represents CSF, the right circle represents gray matter, and the left circle
represents white matter. Voxels corresponding to the pixels in the overlapping
regions are assumed to contain equal proportions of the overlapping tissues.
Images are windowed together to provide optimal contrast for the observation
of all features in the scene.

Fig. 2. (a)-(f) First principal component image, matched image, eigenimage,
first and second MMAC images, and target point image for the left circle,
respectively. (g)-(i) Ratio, log-ratio, and angle images obtained from dividing
the T1-weighted image shown in Fig. 1(e) by the fourth T2-weighted image
shown in Fig. 1(d). Images are windowed individually to provide optimal
contrast for the segmentation of the left circle.

308 IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 11, NO. 3, SEPTEMBER 1992

Fig. 3. (a)-(f) Second principal component image, modified-matched image,
eigenimage, third and fourth MMAC images, and inverted target point
image for the left circle, respectively. (g)-(i) Ratio, log-ratio, and angle
images obtained from dividing the T1-weighted image shown in Fig. 1(e) by
the average of all images shown in Fig. 1. Images are windowed individually
to provide optimal contrast for the segmentation of the left circle.

Fig. 4. (a)-(e) Four spin-echo T2-weighted (TE/TR = 25-100/2500 ms) and
one T1-weighted MR images of a shell-removed hard-boiled egg in gelatin,
respectively. Images are windowed together to provide optimal contrast for
the observation of all features in the scene. Note the zipper artifact in the
T2-weighted images.

Fig. 5. (a)-(f) First principal component image, matched image, eigenimage,
first and second MMAC images, and target point image for the egg-white,
respectively. (g)-(i) Ratio, log-ratio, and angle images obtained from dividing
the first T2-weighted image shown in Fig. 4(a) by the T1-weighted image
shown in Fig. 4(e). Images are windowed individually to provide optimal
contrast for the segmentation of the egg-white.

Fig. 6. (a)-(f) Second principal component image, modified-matched image,
eigenimage, third and fourth MMAC images, and inverted target point
image for the egg-white, respectively. (g)-(i) Ratio, log-ratio, and angle
images obtained from dividing the T1-weighted image shown in Fig. 4(e) by
the average of all images shown in Fig. 4. Images are windowed individually
to provide optimal contrast for the segmentation of the egg-white.

The matched image, eigenimage, first and second MMAC
images, and target point image for the desired features are
shown in Figs. 2(b)-(f), 5(b)-(f), and 8(b)-(f). The modified
matched image, eigenimage, third and fourth maximum
CNR images, and inverted target point image for the desired
features are shown in Figs. 3(b)-(f), 6(b)-(f), and
9(b)-(f). Although the matched image provides the maximum
possible $\mathrm{SNR}_d$, its segmentation ability is inferior to that of
the modified matched image, TPI, ITPI, and especially the
eigenimage. This is due to the presence of interfering features
in the matched image. The other composite images provide
an inferior $\mathrm{SNR}_d$ but do partially remove undesired features
from the scene. The windowed version of the third MMAC
image shows the desired feature, while other features are
removed. However, the partial volume averaging effects are
not corrected.

Among all of these filters, the eigenimage filter removes
interfering features completely, and also provides CPV. This
correction is shown in the overlapping regions of the simulation
in Fig. 2(c), between white and gray matter in Fig. 8(c),
at the border of egg-white and egg-yolk in Fig. 5(c), and in the
overlapping region of the agarose phantom in Fig. 11(f)-(g). A
map of pixels containing partial volume averaging effects for
the egg phantom is shown in Fig. 10. Estimated and original
partial volumes for the simulation and the agarose phantom
are given in Table IV. Regarding the eigenimage of the brain,
it can be seen that the spongy material (choroid plexus) inside
the CSF area has been projected into the eigenimage. This is
called a projection artifact; a procedure for correcting these
kinds of artifacts has been developed by Windham et al. [27].

Fig. 7. (a)-(e) Four spin-echo T2-weighted (TE/TR = 25-100/2500 ms)
and one T1-weighted (TE/TR = 25/500 ms) MR images of a human brain,
respectively. Images are windowed together to provide optimal contrast for
the observation of all features in the scene.

Fig. 8. (a)-(f) First principal component image, matched image, eigenimage,
first and second MMAC images, and target point image for the white
matter, respectively. (g)-(i) Ratio, log-ratio, and angle images obtained from
dividing the T1-weighted image shown in Fig. 7(e) by the fourth T2-weighted
image shown in Fig. 7(d). Images are windowed individually to provide
optimal contrast for the segmentation of the white matter.

Fig. 9. (a)-(f) Second principal component image, modified-matched image,
eigenimage, third and fourth MMAC images, and inverted target point
image for the white matter, respectively. (g)-(i) Ratio, log-ratio, and angle
images obtained from dividing the T1-weighted image shown in Fig. 7(e) by
the average of all images shown in Fig. 7. Images are windowed individually
to provide optimal contrast for the segmentation of the white matter.

Fig. 10. (a) Third T2-weighted MR image of the egg phantom. (b)-(d)
Maps of pixels corresponding to voxels containing partial volumes of egg-yolk
and egg-white, egg-white and gelatin, and either of them, respectively.

Fig. 11. (a) A sagittal view of an agarose phantom. Slice position for the
coronal images in (b)-(e) is also shown. Dark and bright regions correspond
to 3% and 2% agarose compounds, respectively. (b)-(e) Four coronal spin-echo
T2-weighted (TE/TR = 25-100/2500 ms) MR images of the agarose phantom.
Images in (b)-(e) are windowed together to provide optimal contrast
for the observation of all features in the scene. (f)-(g) Eigenimages for the
top and bottom regions, respectively. Note CPV for the central region in
both cases.

D. Results for Nonlinear Filters

We have calculated ratio images by dividing the second
image by the first image ($R^{21}$), the third by the second ($R^{32}$),
the fourth by the third ($R^{43}$), the fifth by the fourth ($R^{54}$),
and finally, the first by the fifth ($R^{15}$). For the simulation and
the brain sequence, all images had some pixel intensities very
close to zero, so the ratio images had a very large dynamic
range, and none of them could be displayed to visualize
different features in the scene. This also happened for some of
the ratio images of the egg phantom. Hence, we have shown $R^{15}$,
the best of the egg ratio images (in terms of segmentation of
the desired feature), in Fig. 5(g). The $R^{54}$ for the simulation
and the brain are shown in Figs. 2(g) and 8(g). By taking
logarithms and arctangents of these ratio images ($R^{54}$ for the
simulation and brain, $R^{15}$ for the egg phantom), we generated
log-ratio and angle images, which are shown in Figs. 2(h)-(i),
5(h)-(i), and 8(h)-(i).

For reducing the propagated noise and avoiding division
by zero, we also considered ratio images formed by dividing
each image by the average image. The best ratio, log-ratio, and
angle images with respect to the average image are
displayed in Figs. 3(g)-(i), 6(g)-(i), and 9(g)-(i),
respectively. Note that the propagated noise is reduced, i.e.,
image SNR and CNR are improved over the above ratio
images (see Tables III(b)-(d)), and the dynamic range is not
excessive.

For the simulation, the logarithm sufficiently reduced the
dynamic range for one of the ratio images but not for another.
The reason is that the latter contains a few pixels with
intensities very close to zero; their logarithms tend to minus
infinity, resulting in a very large dynamic range for the filtered
image. The same problem also occurred in other cases, not shown
in this paper for conciseness. This suggests that the logarithm
function is not adequate for solving the dynamic range problem. The
arctangent function, on the other hand, did sufficiently reduce
the dynamic range for all ratio images. These results are in
agreement with those found by Wecksung et al. [35] for ratio
images in the field of remote sensing.
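The dynamic range behavior described above can be reproduced with a small numerical sketch. The Python code below computes ratio, log-ratio, and angle values for a few hypothetical positive pixel intensities, including pixels very close to zero, and also forms the average image used as an alternative denominator. The function names and the pixel values are illustrative assumptions.

```python
import math

def average_image(sequence):
    """Pixelwise average of a sequence of flattened images."""
    n = len(sequence)
    return [sum(px) / n for px in zip(*sequence)]

def ratio_images(numerator, denominator):
    """Pixelwise ratio, log-ratio, and angle (arctangent) values for two
    flattened images with strictly positive intensities."""
    ratio, log_ratio, angle = [], [], []
    for a, b in zip(numerator, denominator):
        r = a / b
        ratio.append(r)
        log_ratio.append(math.log(r))
        angle.append(math.atan(r))
    return ratio, log_ratio, angle

# Hypothetical pixels: an ordinary pixel, one with a near-zero denominator,
# and one with a near-zero numerator.
image_a = [100.0, 100.0, 1e-6]
image_b = [50.0, 1e-6, 100.0]

ratio, log_ratio, angle = ratio_images(image_a, image_b)
avg = average_image([image_a, image_b])
ratio_avg, _, _ = ratio_images(image_a, avg)

print(max(ratio) / min(ratio))            # dynamic range of the ratio image
print(max(log_ratio) - min(log_ratio))    # spread of the log-ratio image
print(max(angle) - min(angle))            # spread of the angle image
print(max(ratio_avg))                     # ratio with respect to the average image
```

The arctangent maps every positive ratio into the interval from 0 to pi/2, so the angle image has a bounded range regardless of near-zero pixels, whereas the logarithm grows without bound in magnitude as a ratio approaches zero or infinity. Dividing by the average image keeps the ratio below the number of images in the sequence when all intensities are positive.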


E.  Comparison  of  Theoretical  and  Experimental  Measures 

Using the analytical expressions given in Tables I and II, we
have computed the $\mathrm{SNR}_d$'s and $\mathrm{CNR}_{du}$'s of the composite images
in each example. Also, using the sample mean (2) and variance
(3) estimators, we have experimentally estimated these
quantities. Theoretical and experimental $\mathrm{SNR}_d$'s and $\mathrm{CNR}_{du}$'s
are compared in Tables III(b)-(d). The mathematical predictions
and experimental results for the simulated image sequence
are in close agreement. This is because the additive noise
in simulated images is white with a Gaussian distribution.
However, some of the mathematical predictions differ from the
experimental results for the egg and brain images. We attribute
this to the field and sample (egg and brain) inhomogeneities,
which result in inhomogeneous regions for the same material.

Finally, the CPV is investigated by using (8) for eigenimages
of the simulation and an agarose phantom; note that this
cannot be done for the egg phantom and the brain images since
the actual partial volumes in the voxels containing multiple
tissues are nonconstant and unknown. The partial volume is
estimated by

$$\frac{\hat{V}_{djk}}{V} = \frac{E[TI_{jk}]}{E[T(P^d_{lm})]}. \qquad (9)$$

In the simulation, there are three overlapping regions between
the desired feature (assumed to be the left circular region) and
the undesired features (assumed to be the right and top circular
regions). For these three regions, estimated partial volumes
(EPV) are compared to the original partial volumes (OPV)
used in preparing the simulation. For the agarose phantom,
three slices at different locations are considered (see Fig. 11),
and EPV for six eigenimages, generated by taking one of the
pure regions as the desired feature and the other pure region as
the interfering feature, is compared to the corresponding OPV
which is known from the slice location. Results of both studies
are summarized in Table IV. It is seen that the eigenimage
filter achieves CPV in both cases. Small differences are due
to the estimation of the signature vectors and mean values, as
well as sample and magnetic field inhomogeneities and slice
positioning errors which are inevitable in actual MRI studies.
This is consistent with the proof given in the Appendix that the
eigenimage filter attains CPV.


IV.  Discussion  of  Results 

We compared the performance of the eigenimage filter to
that of four linear transformations, namely, principal component
analysis, matched, modified-matched, and maximum
contrast filters, and four nonlinear transformations, namely,
target point image method, ratio, log-ratio, and angle image
filters. The performance comparison was made by listing
mathematical expressions for the $\mathrm{SNR}_d$'s and $\mathrm{CNR}_{du}$'s of composite
images, investigating CPV by a linear filter, and comparing
$\mathrm{SNR}_d$, $\mathrm{CNR}_{du}$, SDF, and CPV on three MRI scene sequences.

A.  Linear  Filters 

We  found  two  difficulties  in  using  PCA  for  image  sequence 
analysis.  First,  PCA  is  size  dependent— it  enhances  large 
objects  better  than  small  objects,  since  small  objects  make 
less  contribution  to  the  estimation  of  the  covariance  matrix 
than  large  objects.  Another  way  of  looking  at  this  is  that  small 
objects  rarely  affect  GSNR;  there  is  no  “motivation”  for  PCA 
to  enhance  them.  Therefore,  small  objects  (abnormalities)  are 
not  usually  visualized  in  the  first  PCA  image.  This  suggests 
that  for  image  analysis  and  interpretation,  all  of  the  PCA 
images  should  be  used.  For  instance,  although  the  last  PCA 
image  contains  a  small  portion  of  the  total  variance,  it  may 
visualize  features  hidden  in  the  original  images;  using  PCA 
for  MR  image  compression  risks  losing  these  hidden  features. 
Second,  in  general,  there  is  no  guarantee  that  a  desired  feature 
will  be  enhanced  and  undesired  features  suppressed  in  a  PCA 
image.  On  the  other  hand,  an  advantage  of  PCA  is  that  there 
is  no  need  to  estimate  signature  vectors.  Hence,  PCA  may  be 
used  even  though  the  desired  feature  (e.g.,  an  abnormality)  is 
not  detectable  in  the  original  images. 

The matched filter maximizes the $\mathrm{SNR}_d$, but since it does
not remove any of the interfering features from the scene, it
is inappropriate for image segmentation. Its modified version
removes any constant (bias) features, at the expense of a
decrease in $\mathrm{SNR}_d$. The modified-matched filter is a special
case of the eigenimage filter (with the constant vector as
the undesired signature vector). Since only changing features
are left in a modified-matched image, there is a chance for
segmentation of the feature of interest.

In many applications, however, there are nonconstant
features which interfere with the observation of the desired object.
The eigenimage filter is the optimal linear filter for removing
any interfering feature at the lowest possible cost (smallest
decrease in $\mathrm{SNR}_d$). As seen from Table I, the $\mathrm{SNR}_d$ and
$\mathrm{CNR}_{du}$'s in the eigenimage depend on: 1) the SNR of the
original images, and 2) the angle between the desired signature
vector and its projection onto the undesired subspace. To
improve the $\mathrm{SNR}_d$ and $\mathrm{CNR}_{du}$'s of the eigenimage, one can
improve the quality of the original images [54] and/or optimize
MRI protocols and pulse sequence parameters to increase
this angle [53]. The important points, however, are that the
eigenimage filter always segments the desired feature, and also
corrects for partial volume averaging effects.

The maximum contrast filter was originally developed for
maximizing CNR between a desired and one undesired feature.
There is no unique approach to extend this filter to the case of
multiple interfering features. We considered three possibilities
for the case of multiple interfering features: maximized
minimum absolute CNR; target point; and inverted target point
image methods. The best MMAC image may achieve higher
CNR than the eigenimage, TPI, and ITPI. This is achieved at
the expense of leaving interfering features in the scene and
not correcting for partial volume averaging effects.

TABLE III (a)
Experimental SNR_d, CNR_du1, and CNR_du2 for the Original Images

Study         Image Name    SNR_d     CNR_du1    CNR_du2
Simulation    First T2      47.993    -16.542     -4.809
              Second T2     27.734    -14.389    -19.215
              Third T2      16.003     -9.599    -22.409
              Fourth T2     10.663     -6.417    -24.535
              T1            56.525      1.063     30.383
Egg Phantom   First T2      37.387     -5.805     15.338
              Second T2     17.507    -10.166     12.044
              Third T2      10.959    -12.444      9.446
              Fourth T2      5.792    -13.342      3.778
              T1            33.999      1.509     -5.139
Brain         First T2      21.330     -1.526     -0.751
              Second T2     14.030     -1.532     -5.131
              Third T2      10.892     -1.208     -8.415
              Fourth T2      8.258     -0.909    -10.667
              T1            25.893      1.640     12.485

TABLE III (b)
Comparison of Mathematical and Experimental SNR_d, CNR_du1, and CNR_du2 for Linear
and Nonlinear Filters Using a Simulated MRI Sequence with Three Features in the Scene

Filter                       SNR_d     SNR_d     CNR_du1   CNR_du1   CNR_du2   CNR_du2
                             Math.     Exp.      Math.     Exp.      Math.     Exp.
PCA1                         -14.190   -14.290   -13.436   -13.511   -48.939   ^.743
Matched                       81.470    81.099   -16.631   -16.623     4.093     4.095
Mod-Mat                       39.830    39.827     2.597     2.604    40.759    40.477
Eigen                          8.128     8.079     8.128     8.064     8.128     8.125
MMAC1                        -54.629   -54.709    24.802    24.847    30.678    30.623
MMAC2                          6.780     6.834    15.468    15.557    49.190    49.024
MMAC3                        -61.897   -61.814    24.517    24.535    24.517    24.456
MMAC4                         74.070    73.367   -14.128   -14.107    14.128    14.025
TPI                            3.094     3.093   -26.421   -26.607   -54.840   -54.405
ITPI                           2.370     2.467     3.097     3.225     3.223     3.356
Ratio (T1 / fourth T2)        10.478    10.325     5.343     5.307    12.711    12.541
Log-ratio (T1 / fourth T2)    17.477    17.582     6.112     6.133    26.057    26.092
Angle (T1 / fourth T2)        79.630    80.928     6.488     6.506    36.765    36.573
Ratio (T1 / average)          58.893    59.192    16.020    16.082    43.030    42.018
Log-ratio (T1 / average)      33.907    34.078    16.344    16.397    37.455    36.540
Angle (T1 / average)         145.926   146.673    16.418    16.467    39.%7     38.694

B.  Nonlinear  Filters 

We considered ratio, log-ratio, and angle images. We also
discussed the possibility of dividing each image by the average
image, and then applying logarithm and arctangent functions.
We showed that this choice reduces the noise propagated to the
transformed images, since noise is suppressed in the average
image.

We noted that in the calculation of PCA, ratio, log-ratio,
and angle images, there is no need to define signature vectors.
Therefore, these transformations can be classified as
unsupervised methods. In contrast, signature vectors must be
defined for matched, modified matched, eigenimage, maximum
contrast, and target point filters. Hence, they can be classified
as supervised methods.

C.  Conclusions 

From both mathematical and experimental results (Tables I,
III.b-d) it is seen that among all linear filters, the matched filter
achieves the maximum value for SNR_d and the Maximum
CNR filter (MMAC) gives the maximum value for CNR_du.
However, they do not necessarily segment the desired feature,
or correct for partial volume averaging effects. Among
nonlinear filters, the angle image calculated with respect to
the average image provided the maximum value for SNR_d,
and the target point image achieved the maximum absolute
value for CNR_du. The target point image always segments
the desired feature; however, the desired feature is actually
removed from the scene (it appears black). The inverted target
point image overcomes this difficulty, but suffers from low
SNR_d and CNR_du. The angle image may or may not segment
the desired feature; none of these generally corrects for partial
volume averaging effects.

Of the transformations discussed in this paper, the eigenimage
filter is the optimum linear filter which segments the
desired feature, removes interfering features, and corrects for
partial volume averaging effects. For situations in which the SNR_d
and CNR_du values of the eigenimages are unsatisfactory, one can
improve these quantities by improving the quality of the
original images [54] and/or optimizing the MRI pulse sequence
and parameters [53].

TABLE III (c)

Comparison of Mathematical and Experimental SNR_d, CNR_du1, and CNR_du2 for Linear and Nonlinear Filters Using an Egg Phantom MRI Sequence with Three Features in the Scene

Filter                          SNR_d     SNR_d     CNR_du1   CNR_du1   CNR_du2   CNR_du2
                                Math.     Exp.      Math.     Exp.      Math.     Exp.
PCA_1                            52.32     34.13    -17.33     -9.61     16.14     10.83
Matched                          57.86     41.55     -8.13     -4.53      7.87      5.14
Mod-Mat                          32.21     25.18     14.[illegible]2   12.88    -11.23     -8.65
Eigen                             9.16     16.08      5.96     13.94     10.72     17.97
MMAC_1                          -19.02    -11.99     24.74     18.55    -22.40    -16.15
MMAC_2                           16.18     11.43    -19.68    -18.37     28.16     22.21
MMAC_3                           -3.04     -4.67      6.11      9.64     11.00     15.45
MMAC_4                           12.99     10.75    -15.12    -17.38     27.20     24.25
TPI                               3.09      2.15    -26.35    -19.46    -30.34    -27.51
ITPI                              2.37      2.19      3.10      2.94      3.13      2.85
R [superscript illegible]        23.01     28.39     -5.64     -7.68     16.59     21.54
[label illegible]                -8.83    -11.00     -5.70     -8.73     15.51     26.06
AG [superscript illegible]       29.58     36.75     -5.80     -8.70     17.33     24.43
R [superscript illegible]        38.80     30.03     13.19     15.18    -19.12    -24.52
LR [superscript illegible]       28.44     21.62     14.20     14.40    -21.11    -20.62
AG [superscript illegible]      111.61     83.88     14.40     13.97    -21.36    -17.67

TABLE III (d)

Comparison of Mathematical and Experimental SNR_d, CNR_du1, and CNR_du2 for Linear and Nonlinear Filters Using a Brain MRI Sequence with Three Features in the Scene

Filter                          SNR_d     SNR_d     CNR_du1   CNR_du1   CNR_du2   CNR_du2
                                Math.     Exp.      Math.     Exp.      Math.     Exp.
PCA_1                            39.48     38.48     -0.96     -0.35      1.87      1.82
Matched                          39.48     38.43     -0.97     -0.36      1.76      1.73
Mod-Mat                          15.53     25.78      1.78      2.60     20.85     28.98
Eigen                             3.44      5.13      2.51      4.61      4.03      5.08
MMAC_1                           -8.44     -9.40      4.54      5.36     21.86     23.05
MMAC_2                            2.71      4.49      3.87      4.22     25.64     30.76
MMAC_3                          -19.67    -25.05      3.07      5.06      4.94      5.25
MMAC_4                           20.75     29.41     -1.90     -2.26      3.05      3.95
TPI                               3.09      3.00     -2.81     -3.80    -27.40    -31.67
ITPI                              2.37      2.47      1.92      2.86      3.11      3.26
[label illegible]                 7.59     10.00      1.77      4.15     10.42     11.99
[label illegible]                 9.30     13.28      1.87      3.24     17.93     20.87
AG [superscript illegible]       36.06     53.05      1.89      2.61     22.94     26.43
R [superscript illegible]        29.18     33.82      4.02      4.55     24.05     25.03
LR [superscript illegible]       11.99     14.64      4.00      3.54     19.04     20.47
[label illegible]                62.42     74.02      3.99      3.48     21.86     23.44

V.  Appendix 
A.  Details  of  Linear  Filters 

1) Principal Component Analysis (PCA): The weighting
vectors for PCA are the normalized eigenvectors of the n x n
sample covariance matrix K, estimated as

$$K_{il} = \frac{1}{M} \sum_{(j,k) \in \mathrm{ROI}} (P_{jki} - \bar{P}_i)(P_{jkl} - \bar{P}_l), \qquad i, l = 1, \ldots, n. \qquad (10)$$

Here $P_{jki}$ is the intensity of the (j, k)th pixel in the ith image,
$\bar{P}_i$ denotes the ith image average gray level in the region of
interest (ROI) over which the covariance matrix is estimated,
and M is the total number of pixels in the ROI. This ROI may
be a small region, or the set of all pixels whose intensities are
above some specific threshold, or even the whole image. If the
ROI is the entire image, the transformation will be driven by
overall properties of the tissues within the slice. The order of
the PCA images is given by the quantitative distribution of the
corresponding eigenvalues of the sample covariance matrix.

2) Matched Filter: The weighting vector for the matched
filter is

$$W = d \qquad (11)$$

where d is the desired signature vector.

3) Modified-Matched Filter: The weighting vector for the
modified matched filter is

[equation illegible in source] (12)

where $c = [1, 1, \ldots, 1]^T$.

4) Eigenimage Filter: The weighting vector for the eigenimage
filter is

$$W = d - d^{p} \qquad (13)$$

where $d^{p}$ is the projection of d onto the subspace spanned by
$\{S_k,\ k = 1, \ldots, m,\ k \neq d\}$ (note that $S_d = d$). We refer
to this subspace as the undesired subspace. The weighting
vector is computed using a Gram-Schmidt orthogonalization
procedure.

5) Maximum Contrast Filter: For a scene with one interfering
(undesired) feature, the weighting vector is

$$W = d - u. \qquad (14)$$

For a scene with two interfering (undesired) features, the
candidate weighting vectors are

1) $W_1 = d - u_1$;

2) $W_2 = d - u_2$;

3) $W_3 = (W_1 \cdot W_1 - W_1 \cdot W_2) W_2 + (W_2 \cdot W_2 - W_1 \cdot W_2) W_1$;

4) $W_4 = (W_1 \cdot W_1 + W_1 \cdot W_2) W_2 - (W_2 \cdot W_2 + W_1 \cdot W_2) W_1$.

For a scene with more than two interfering (undesired)
features, the candidate weighting vectors are given in [55].

TABLE IV

Correction of Partial Volume Averaging Effects by Eigenimage Filter for MRI Sequences
of a Simulation and an Agarose Phantom with Overlapping Features in the Scene

       Simulation                  Agarose Phantom
EPV    0.4993   0.4775   0.3531    0.5260   0.4847   0.6881   0.3384   0.3715   0.6043
OPV    0.5000   0.5000   0.3333    0.5250   0.4750   0.6750   0.3250   0.3750   0.6250

Estimated Partial Volumes (EPV) are obtained using the sample mean estimator for the overlapping regions of the
simulation and the agarose phantom.

Original Partial Volumes (OPV) are values which have been used in preparing the simulation and the agarose
phantom, respectively.

B. Details of Nonlinear Filters

1) Target Point Image Method: The target point image (TPI)
is defined as

$$TPI_{jk} = \left[ \sum_{i=1}^{n} (P_{jki} - d_i)^2 \right]^{1/2} \qquad (15)$$

Here and in the sequel, subscript jk is used to indicate the (j, k)th
pixel of the corresponding image. The inverted target point
image (ITPI) is defined as

$$ITPI_{jk} = \frac{1}{TPI_{jk}} \qquad (16)$$

2) Ratio Filter: The ratio image between the lth and mth
images ($R^{lm}$) is defined as

$$R^{lm}_{jk} = \frac{P_{jkl}}{P_{jkm}} \qquad (17)$$

where $P_{jkl}$ and $P_{jkm}$ are respective pixels in the lth and mth
images; clearly the mth image must be nonzero everywhere.

3) Log-Ratio Filter: The log-ratio image between the lth
and mth images is defined as

$$LR^{lm}_{jk} = \ln[R^{lm}_{jk}] = \ln[P_{jkl}] - \ln[P_{jkm}]. \qquad (18)$$

4) Angle Image Filter: The angle image between the lth and
mth images ($AG^{lm}$) is defined as

$$AG^{lm}_{jk} = \arctan[R^{lm}_{jk}]. \qquad (19)$$

5) Use of Average Image: It is also possible to generate
ratio, log-ratio, and angle images with respect to an average
image (A), defined as

$$A_{jk} = \frac{1}{n} \sum_{i=1}^{n} P_{jki}. \qquad (20)$$

When using the ratio of the lth image in the sequence to
the average image, the resulting images are called ratio over
average ($R^{la}$), log-ratio over average ($LR^{la}$), and angle
over average ($AG^{la}$), respectively. This choice reduces the
noise propagated to the transformed images, because noise is
decreased in the average image (see Tables II and III).
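
The nonlinear filters of this subsection are pixelwise operations, so they can be sketched in a few lines. The following is a minimal illustration, assuming images are stored as nested lists of positive gray levels; the function names and the 2 x 2 sample data are hypothetical and are not taken from the paper.

```python
import math

def ratio_image(img_l, img_m):
    """Pixelwise ratio of the l-th image to the m-th image (m-th image must be nonzero)."""
    return [[p_l / p_m for p_l, p_m in zip(row_l, row_m)]
            for row_l, row_m in zip(img_l, img_m)]

def log_ratio_image(img_l, img_m):
    """Natural logarithm of the ratio image (gray levels must be positive)."""
    return [[math.log(r) for r in row] for row in ratio_image(img_l, img_m)]

def angle_image(img_l, img_m):
    """Arctangent of the ratio image."""
    return [[math.atan(r) for r in row] for row in ratio_image(img_l, img_m)]

def average_image(sequence):
    """Pixelwise mean of all images in the sequence."""
    n = len(sequence)
    rows, cols = len(sequence[0]), len(sequence[0][0])
    return [[sum(img[j][k] for img in sequence) / n for k in range(cols)]
            for j in range(rows)]

# Hypothetical two-image sequence of 2 x 2 pixels.
sequence = [
    [[100.0, 80.0], [60.0, 40.0]],
    [[50.0, 40.0], [60.0, 80.0]],
]
avg = average_image(sequence)
r_12 = ratio_image(sequence[0], sequence[1])      # ratio of image 1 to image 2
r_1a = ratio_image(sequence[0], avg)              # ratio of image 1 to the average
lr_12 = log_ratio_image(sequence[0], sequence[1])
ag_12 = angle_image(sequence[0], sequence[1])
```

Passing the average image as the second argument gives the "over average" variants without any further code.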


C.  Rationale  for  the  Application  of  Ratio  Filter  to  MRI 

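Considering random distributions for tissue parameters [56],
the signal in the ith T2-weighted multiple spin-echo image
($S_i$) can be represented and simplified as follows:

$$
\begin{aligned}
S_i &= [N(H) + \sigma_N] \left[ 1 - \exp\left( \frac{-TR}{T1 + \sigma_{T1}} \right) \right] \exp\left( \frac{-i\,TE}{T2 + \sigma_{T2}} \right) \\
    &= [N(H) + \sigma_N] \left[ 1 - \exp\left( \frac{-TR}{T1 + \sigma_{T1}} \right) \right] \exp\left( -\frac{i\,TE}{T2} \cdot \frac{1}{1 + \sigma_{T2}/T2} \right) \\
    &\approx [N(H) + \sigma_N] \left[ 1 - \exp\left( \frac{-TR}{T1 + \sigma_{T1}} \right) \right] \exp\left( -\frac{i\,TE}{T2} \left( 1 - \frac{\sigma_{T2}}{T2} \right) \right) \\
    &= [N(H) + \sigma_N] \left[ 1 - \exp\left( \frac{-TR}{T1 + \sigma_{T1}} \right) \right] \exp\left( \frac{i\,TE\,\sigma_{T2}}{T2^2} \right) \exp\left( -\frac{i\,TE}{T2} \right) \\
    &= \eta_1 \eta_2^{\,i} \exp\left( -\frac{i\,TE}{T2} \right)
\end{aligned}
\qquad (21)
$$

Here $\eta_1$ and $\eta_2$ include the total uncertainty from the tissue
parameter distributions. From (21) it is inferred that in a ratio
image, most of the signal variations within a specific tissue
are compensated. This can improve the SNR of the image
(see Tables III.a-d). Proton density contribution to the signal
is also suppressed. This may be considered as a disadvantage.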
However,  in  a  situation  where  the  overall  signal  is  dominated 
by  the  proton  density  contribution  (i.e.,  the  proton  density 
contribution  dominates  the  relaxation  times  contributions  to 
the  overall  signal),  this  cancellation  can  result  in  visualization 
of  hidden  features. 


D.  Methods  of  Deriving  SNR  and  CNR  Expressions 

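1) Standard Formula for Noise Propagation: For a random
variable z which is a smooth function of uncorrelated random
variables $x_1, x_2, \ldots, x_m$,

$$z = f(x_1, x_2, \ldots, x_m) \qquad (22)$$

we have [50], [51]

$$\mathrm{var}(z) = \sigma_z^2 \approx \sum_{i=1}^{m} \sigma_{x_i}^2 \left[ \frac{\partial f}{\partial x_i} \right]^2_{\bar{x}_1, \ldots, \bar{x}_m} \qquad (23)$$

$$\bar{z} \approx f(\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_m) \qquad (24)$$

where var(z), $\sigma_z$, and $\bar{z}$ are the variance, standard deviation,
and mean of the random variable z, respectively. Similarly,
$\sigma_{x_i}$ and $\bar{x}_i$ are the standard deviation and the mean of random
variable $x_i$.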
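
As an illustration of the standard noise propagation formula, the sketch below evaluates the first-order variance numerically for a ratio of two gray levels and compares it with the closed form obtained by differentiating by hand. It assumes uncorrelated inputs; the function name `propagated_variance`, the use of central differences, and the sample means and standard deviations are hypothetical choices made for this example only.

```python
def propagated_variance(f, means, sigmas, h=1e-6):
    """First-order variance of f: sum_i sigma_i^2 * (df/dx_i)^2, evaluated at the means.

    Partial derivatives are approximated by central differences.
    """
    total = 0.0
    for i, (mean_i, sigma_i) in enumerate(zip(means, sigmas)):
        step = h * max(abs(mean_i), 1.0)
        up = list(means)
        up[i] = mean_i + step
        down = list(means)
        down[i] = mean_i - step
        partial = (f(*up) - f(*down)) / (2.0 * step)
        total += (sigma_i * partial) ** 2
    return total

def ratio(p_l, p_m):
    """Ratio of two pixel gray levels."""
    return p_l / p_m

# Hypothetical mean gray levels 100 and 50, each with noise standard deviation 2.
var_ratio = propagated_variance(ratio, means=[100.0, 50.0], sigmas=[2.0, 2.0])

# Closed form for the ratio: sigma^2 / m^2 + sigma^2 * l^2 / m^4.
closed_form = (2.0 / 50.0) ** 2 + (2.0 * 100.0 / 50.0 ** 2) ** 2
```

The two values agree to within the accuracy of the finite-difference derivative, which is the behavior expected from a first-order expansion around the means.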


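2) Chi-Squared Distribution (SNR Calculation for TPI and
ITPI): From the assumed i.i.d. (independent and identically
distributed) Gaussian model for MRI noise, it follows that
the squared pixel gray levels in the desired (target) ROI
of a TPI are scaled (multiplied by $\sigma^2$) i.i.d. chi-squared
random variables with n degrees of freedom. The chi-squared
probability density function is [57]

$$
f(x; n) =
\begin{cases}
\dfrac{x^{n/2 - 1} e^{-x/2}}{2^{n/2}\, \Gamma(n/2)} & \text{for } x > 0 \\
0 & \text{for } x \le 0
\end{cases}
\qquad (25)
$$

where n is the degrees of freedom; E[x] = n and Var(x) =
2n. For a random variable $y = \sigma^2 x$, $E[y] = n\sigma^2$ and
$\mathrm{Var}(y) = 2n\sigma^4$.

For TPI we need to calculate the mean and variance of the
gray levels inside the DROI of the TPI represented by random
variable $z = \sqrt{y} = \sigma\sqrt{x}$. In [52] we showed that

$$E[z] = \frac{\sigma \sqrt{2}\, \Gamma\left(\frac{n+1}{2}\right)}{\Gamma\left(\frac{n}{2}\right)} \qquad (26)$$

$$\mathrm{Var}(z) = n\sigma^2 - 2\sigma^2 \left[ \frac{\Gamma\left(\frac{n+1}{2}\right)}{\Gamma\left(\frac{n}{2}\right)} \right]^2 \qquad (27)$$

Using (26) and (27), (1) simplifies to

$$SNR_d = \frac{\sqrt{2}\, \Gamma\left(\frac{n+1}{2}\right)}{\sqrt{n\, \Gamma^2\left(\frac{n}{2}\right) - 2\, \Gamma^2\left(\frac{n+1}{2}\right)}} \qquad (28)$$

For ITPI we need to calculate the mean and variance of
random variable w = 1/z. We showed in [52] that

$$E[w] = \frac{\Gamma\left(\frac{n-1}{2}\right)}{\sigma \sqrt{2}\, \Gamma\left(\frac{n}{2}\right)} \qquad (29)$$

$$\mathrm{Var}(w) = \frac{1}{\sigma^2 (n - 2)} - \left( \frac{\Gamma\left(\frac{n-1}{2}\right)}{\sigma \sqrt{2}\, \Gamma\left(\frac{n}{2}\right)} \right)^2 \qquad (30)$$

Plugging (29) and (30) in (1) and simplifying yields

$$SNR_d = \frac{1}{\sqrt{\dfrac{2\, \Gamma^2\left(\frac{n}{2}\right)}{(n - 2)\, \Gamma^2\left(\frac{n-1}{2}\right)} - 1}} \qquad (31)$$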

E.  Correction  for  Partial  Volume  Averaging  Effects  by 
Eigenimage  Filter 

We define a vector space whose dimension, n, is equal
to the number of images in the sequence. In this vector
space, a pixel vector $P_{jk} = [P_{jk1}\ P_{jk2}\ \cdots\ P_{jkn}]^T$ is an
n-dimensional vector whose elements are the gray levels of
the (j, k)th pixels of the images in the sequence. Signature
vectors $\{S_l = [S_{l1}\ S_{l2}\ \cdots\ S_{ln}]^T,\ l = 1, \ldots, m\}$ are also
n-dimensional vectors whose ith elements specify the desired
and undesired features in the ith image, respectively. For a
linear transformation, the transformed image (TI) is a weighted
sum (linear combination) of original images in the sequence,
i.e.,

$$TI_{jk} = \sum_{i=1}^{n} W_i P_{jki} = W \cdot P_{jk} \qquad (32)$$

where $W = [W_1\ W_2\ \cdots\ W_n]^T$ is the weighting vector to be
determined. Writing (7) in vector notation gives

$$P_{jk} = \sum_{l=1}^{m} \frac{V_{ljk}}{V} S_l + \eta_{jk}. \qquad (33)$$

Therefore, we have

$$TI_{jk} = W \cdot P_{jk} = \sum_{l=1}^{m} \frac{V_{ljk}}{V}\, W \cdot S_l + W \cdot \eta_{jk} \qquad (34)$$

and taking expectations (assuming a priori known signature
vectors) yields

$$E[TI_{jk}] = \sum_{l=1}^{m} \frac{V_{ljk}}{V}\, W \cdot S_l + W \cdot E[\eta_{jk}]. \qquad (35)$$

From (8), correction for partial volume averaging effects
(CPV) requires

$$E[TI_{jk}] = \frac{V_{djk}}{V}\, E[T(P^{D}_{lm})] = \frac{V_{djk}}{V}\, W \cdot S_d \qquad (36)$$

where the second equality follows from $E[W \cdot \eta_{jk}] = W \cdot E[\eta_{jk}] = 0$, and

$$E[T(P^{D}_{lm})] = E[W \cdot P^{D}_{lm}] = W \cdot E[P^{D}_{lm}] = W \cdot S_d \qquad (37)$$

where $P^{D}_{lm}$ represents a pixel vector in the DROI. Subtracting
(36) from (35) we obtain

$$\sum_{\substack{l=1 \\ l \neq d}}^{m} \frac{V_{ljk}}{V}\, W \cdot S_l = 0. \qquad (38)$$

Since (38) must hold for arbitrary values of $V_{ljk}$, we can
prove the following.

Claim: CPV is achieved if and only if

$$W \cdot S_l = 0, \qquad l = 1, \ldots, m,\ l \neq d. \qquad (39)$$

Proof:

If: If (39) holds, then (38) holds, and subtracting this from
(35) shows that (36) holds and CPV is attained.

Only If: Let $V_{qjk} = V$ and $V_{ljk} = 0$ for all l such that $l \neq q$
and $l \neq d$. Then (38) requires $W \cdot S_q = 0$. This argument
can be made for all integers q ($1 \le q \le m$ and $q \neq d$), hence
(39) holds.

Consequently, only the eigenimage filter among all linear
filters achieves CPV. Moreover, as we have shown in [58],
the filter is optimal in the sense that it achieves the maximum
possible SNR_d while correcting for partial volume averaging
effects.
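
The claim can be illustrated with a small noiseless example. The sketch below assumes a scene with a single interfering feature, so the weighting vector is the desired signature minus its projection onto the one undesired signature; the signature values, the partial volume fraction 0.375, and the function names are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def weights_one_interferer(d, u):
    """W = d - (projection of d onto u), so that W . u = 0."""
    scale = dot(d, u) / dot(u, u)
    return [d_i - scale * u_i for d_i, u_i in zip(d, u)]

# Hypothetical signature vectors for a three-image sequence.
d = [100.0, 60.0, 30.0]   # desired feature
u = [40.0, 50.0, 55.0]    # interfering (undesired) feature
w = weights_one_interferer(d, u)

# Noiseless pixel vector of a voxel holding 37.5% desired and 62.5% undesired material.
fraction = 0.375
pixel = [fraction * d_i + (1.0 - fraction) * u_i for d_i, u_i in zip(d, u)]

# Transformed gray level, normalized by the gray level of a pure desired pixel.
estimated_fraction = dot(w, pixel) / dot(w, d)
print(estimated_fraction)
```

Because the weighting vector is orthogonal to the interfering signature, the normalized transformed gray level returns the partial volume of the desired material, which is the behavior summarized in Table IV.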


VI. NOMENCLATURE


List  of  Abbreviations 

MRI: Magnetic Resonance Imaging

ROI: Region of Interest

DROI: Desired Feature ROI

UROI: Undesired Feature ROI

SNR: Signal-to-Noise Ratio

GSNR: Global SNR

CNR: Contrast-to-Noise Ratio

SDF: Segmentation of the Desired Feature

CPV: Correction for Partial Volume Averaging Effects

EPV: Estimated Partial Volume

OPV: Original Partial Volume

TI: Transformed (composite) Image

PCA: Principal Component Analysis

Mod-Mat: Modified Matched

MMAC: Maximized Minimum Absolute CNR

TPI: Target Point Image

ITPI: Inverted Target Point Image

CSF: Cerebrospinal Fluid

List  of  Notations 

n: The number of images in the original MRI scene sequence

m: The number of features in the scene

$P_{jk}$: The gray level of the (j, k)th pixel in an original image

$P^{D}_{jk}$: The gray level of the (j, k)th pixel in the DROI

M: The number of pixels in the DROI

$P^{U}_{jk}$: The gray level of the (j, k)th pixel in the UROI

$\eta_{jk}$: The zero-mean white noise in the image

$\sigma$: The standard deviation of white noise

T( ): The transformation applied to an MRI scene sequence

$TI_{jk}$: The gray level of the (j, k)th pixel in a transformed image

$PCA_i$: The ith composite image generated by the PCA transformation

$MMAC_i$: The ith composite image generated by the MMAC transformation

$TPI_{jk}$: The gray level of the (j, k)th pixel in a TPI

$ITPI_{jk}$: The gray level of the (j, k)th pixel in an ITPI

$R^{lm}$: The ratio image generated by dividing the lth image by the mth image

$R^{lm}_{jk}$: The gray level of the (j, k)th pixel in an $R^{lm}$

$R^{la}$: The ratio image generated by dividing the lth image by the average of all original images

$R^{la}_{jk}$: The gray level of the (j, k)th pixel in an $R^{la}$

$LR^{lm}$: The log-ratio image generated by taking the natural logarithm of $R^{lm}$

$LR^{lm}_{jk}$: The gray level of the (j, k)th pixel in an $LR^{lm}$

$LR^{la}$: The log-ratio image generated by taking the natural logarithm of $R^{la}$

$LR^{la}_{jk}$: The gray level of the (j, k)th pixel in an $LR^{la}$

$AG^{lm}$: The angle image generated by taking the arctangent of $R^{lm}$

$AG^{lm}_{jk}$: The gray level of the (j, k)th pixel in an $AG^{lm}$

$AG^{la}$: The angle image generated by taking the arctangent of $R^{la}$

$AG^{la}_{jk}$: The gray level of the (j, k)th pixel in an $AG^{la}$

$E[\cdot]$: The expected value operator

$\hat{E}[\cdot]$: An expected value estimator (sample mean)

$\mathrm{Var}(\cdot)$: The variance operator

$\mathrm{var}(\cdot)$: A variance estimator (sample variance)

$SNR_d$: SNR of the desired feature

$CNR_{du}$: CNR between the desired and undesired features

$P_{jk}$ (vector): An n-dimensional vector (pixel vector) whose elements are the gray levels of the (j, k)th pixels of the original images

W: The weighting vector for a linear transformation (filter)

$S_l$: The signature vector for the lth feature

$S_d = d$: The desired feature signature vector

$u_i$: The ith undesired feature signature vector

$V_i$: The volume of the ith material in a voxel

$S_i$: The MRI signal from the ith material

$V_{ljk}$: The partial volume of the lth material in the (j, k)th voxel

$V_{djk}$: The partial volume of the desired feature in the (j, k)th voxel

V: The total volume of a voxel

Abstract

Segmentation of a feature of interest while correcting for
partial volume averaging effects is a major tool for identification
of hidden abnormalities, fast and accurate volume
calculation, and three-dimensional visualization in the field
of magnetic resonance imaging (MRI). We present the optimal
transformation for simultaneous segmentation of a
desired feature and correction of partial volume averaging
effects, while maximizing the signal-to-noise ratio (SNR) of
the desired feature. It is proved that correction of partial
volume averaging effects requires the removal of the interfering
features from the scene. It is also proved that correction
of partial volume averaging effects can be achieved
merely by a linear transformation. It is finally shown that
the optimal transformation matrix is easily obtained using
the Gram-Schmidt orthogonalization procedure, which is
numerically stable. Applications of the technique to MRI
simulation, phantom, and brain images are shown. We
show that in all cases the desired feature is segmented from
the interfering features and partial volume information is
visualized in the resulting transformed images.

I.  INTRODUCTION 

Segmentation of a feature of interest while correcting for
partial volume averaging effects is a major tool for image
analysis and interpretation in the field of magnetic resonance
imaging (MRI). Its applications include identification
of hidden abnormalities [1], fast and accurate volume
calculation [2], and three-dimensional visualization [3]. In
an MRI sequence consisting of several images of the same
anatomical site, the image gray levels corresponding to different
tissue types, which are functions of intrinsic tissue
parameters as well as pulse sequence parameters [4], [5],
change characteristically throughout the image sequence
and contain information pertaining to partial volume averaging
effects. This makes it possible to generate a set of
transformed images in which the partial volumes of each
feature are visualized.


We derive the optimal transformation for correcting partial
volume averaging effects. Optimality is defined as maximizing
the signal-to-noise ratio (SNR) of the desired feature,
i.e., the feature whose partial volumes are corrected
and visualized. In Section II, we first define notations, correction
of partial volume averaging effects, SNR, and the
optimal transformation. We then establish the relationship
between correction of partial volume averaging effects and
removal of the interfering features, where we prove that
correction of partial volume averaging effects requires removal
of the interfering features. We use this relationship
to show that correction of partial volume averaging effects
can be achieved merely by a linear transformation. Finally,
we show that the optimal transformation matrix can be
obtained using the Gram-Schmidt orthogonalization procedure.
In Section III, we present advantages of the new
approach over the previous one for deriving the eigenimage
filter [6]. We also present applications of the technique to
simulation, phantom, and brain images. Conclusions are
given in Section IV. This paper is an extension of the paper
presented at the IEEE Medical Imaging Conference in
conjunction with the Nuclear Science Symposium [7].

II.  METHODS 

A.  Problem  Formulation 

1) Notations: Let V and W be n-dimensional and
m-dimensional real vector spaces, respectively. Then
points in V and W are vectors in $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively.
Let n be the number of images in the sequence and
m be the number of transformed images, which equals the
number of features (objects) in the scene. Then a pixel
vector $P_{jk} = [P_{jk1}\ P_{jk2}\ \cdots\ P_{jkn}]^T$ is an n-dimensional
vector whose elements are the gray levels of the (j, k)-th
pixels in the images in the sequence. The n-dimensional
vectors $\{s_l = [s_{l1}\ s_{l2}\ \cdots\ s_{ln}]^T,\ l = 1, \ldots, m\}$, whose i-th element
defines a specific feature in the i-th image, are called
signature vectors, which are assumed to be linearly independent.
This is a reasonable assumption since MRI gray
levels in each image are distinct non-linear functions of
several tissue parameters including proton density (N(H)),
spin-lattice (T1) and spin-spin (T2) relaxation times, flow
velocity ($v$), and chemical shift ($\delta$), which are different for
different tissues, as well as several pulse sequence parameters
including repetition time (TR), echo time (TE), inversion
time (TI), and pulse flip angle ($\theta$), which are different
for different images in the sequence [4], [5], making linear
dependency of signature vectors very unlikely as long as
$n \ge m$. The vector $s_d$ ($d \in \{1, 2, \ldots, m\}$) represents the
desired feature, and each of the other signature vectors
represents an interfering (undesired*) feature. A pixel in
a transformed image (TI) is a function of all pixels at the
same location in the original images, i.e.,

$$TI_{jkd} = T_d(P_{jk}) \qquad (1)$$

where $TI_{jkd}$ is the gray level of the (j, k)-th pixel in the
d-th transformed image and $T_d$ is the function to be found.
For a linear transformation, $T_d(P_{jk}) = t_d \cdot P_{jk}$, where $t_d =
[t_{d1}\ t_{d2}\ \cdots\ t_{dn}]^T$ is the d-th transformation (weighting)
vector to be determined.

2) Correction for Partial Volume Averaging Effects:
The MR signal S from a voxel containing m different materials
is given by [8]

$$S = \sum_{l=1}^{m} \frac{V_l}{V} S_l \qquad (2)$$

where $V_l$ is the volume of the l-th material within the voxel,
V is the total volume of the voxel, and $S_l$ is the signal from
the l-th material. The gray level $P_{jk}$ of the (j, k)-th pixel
(corresponding to the (j, k)-th voxel) in an MR image is
given by

$$P_{jk} = E[P_{jk}] + \eta_{jk} = \sum_{l=1}^{m} \frac{V_{ljk}}{V} S_l + \eta_{jk} \qquad (3)$$

where $V_{ljk}$ is the partial volume of the l-th material in the
(j, k)-th voxel, and $\eta_{jk}$ represents statistical noise which
is assumed to be an additive zero-mean white Gaussian
noise field, uncorrelated between different scenes of the
same MRI sequence, with standard deviation $\sigma$. (This assumption
was previously made and justified in numerous
articles including [9]-[14].) Note that $E[P_{jk}]$ is deterministic
but unknown, while the noise $\eta_{jk}$ is stochastic, so that
the pixel gray level $P_{jk}$ is the sum of a deterministic value
(to be estimated) and noise. We use the notation $E[P_{jk}]$
to denote the original, deterministic value of the pixel gray
level, which contains information pertaining to partial volume
averaging effects.

Correction of partial volume averaging effects is necessary
for robust interpretation and analysis of MR images,
as well as for volume calculations. It means that we generate
an image whose pixel gray levels, on average, are
proportional to the percentages of a specific tissue in the
corresponding voxels. Mathematically, this may be translated
to generating a transformed image in which

$$E[TI_{jkd}] = \frac{V_{djk}}{V}\, E[T_d(P^{D}_{lm})] \qquad (4)$$

where $E[TI_{jkd}]$ is the mean value of the (j, k)-th pixel in
the transformed image, $V_{djk}$ is the partial volume of the
desired material in the (j, k)-th voxel, and $E[T_d(P^{D}_{lm})]$ is
the mean value of the (l, m)-th pixel in a desired region of
interest (ROI) (e.g., the ROI which was used for defining
the desired signature vector) from the transformed image.
The underlying reason for using the expected value operator
in defining correction of partial volume averaging
effects by Eq. (4) is to exclude the additive noise, which
contains no information pertaining to these effects. An alternative
definition may therefore consist of using a noiseless
image model (pixel vector) in Eq. (4) while dropping
the expected value operator. Either definition may be used
to test correction of partial volume averaging effects. The
first definition is usually more appropriate for experimental
work, while the second definition is sometimes more
appropriate for theoretical development.

* We use interfering and undesired interchangeably throughout the
paper.

3) Signal-to-Noise Ratio: Linearly transformed images
are linear combinations of the images in the sequence,
using different transformation vectors. Since we have m
signature vectors, each of which can be considered as the
desired signature vector, there are a total of m different
transformation vectors resulting in m different transformed
images. The pixel gray levels of these linearly transformed
images ($\{LTI_d,\ d = 1, \ldots, m\}$) are given by

$$LTI_{jkd} = \sum_{i=1}^{n} T_{id} P_{jki} = t_d \cdot P_{jk}, \qquad d = 1, \ldots, m \qquad (5)$$

where $LTI_{jkd}$ is the gray level of the (j, k)-th pixel in the
d-th linearly transformed image, $t_d = [T_{1d}\ T_{2d}\ \cdots\ T_{nd}]^T$ is
the d-th transformation vector to be determined, and $T =
[t_1, t_2, \ldots, t_m]$ is the transformation matrix. For a linear
transformation with the transformation vector $t_d$, and the
presence of an additive zero-mean white noise field with
standard deviation $\sigma$ in the image sequence, the SNR of
the desired feature with the signature vector $s_d$ is expressed
by [15], [16]

$$SNR_d = \frac{t_d \cdot s_d}{\sigma\, (t_d \cdot t_d)^{1/2}}. \qquad (6)$$

4)  Optimal  Transformation:  We  seek  a  transforma¬ 
tion  that  achieves  the  following  objectives  simultaneously: 

•  Correction  of  partial  volume  averaging  effects; 

•  Maximizing  SNR  of  the  desired  feature. 

Theorems 1 and 2 in the next section establish the relationship
between correction of partial volume averaging effects,
removal of the interfering features, and the linearity of the
transformation, and then find the solution.


B.  Derivation  of  Solution 

Theorem  1  (i)  For  any  transfoT^ation,  correction  of 

partial  volume  averaging  effects  requires  removal  of  the  in¬ 
terfering  features;  (a)  correction  of  partial  volume  averag¬ 
ing  effects  can  be  achieved  only  by  a  linear  transformation: 
and  (Hi )  for  a  linear  transformation  correction  of  partial 
volume  averaging  effects  is  equivalent  to  removal  of  the  in¬ 
terfering  features. 

Proof  (i)  correction  of  partial  volume  averaging  effects 
requires  that  Eq.  (4)  hold  for  arbitrary  regions  of  inter¬ 
est  in  which  the  expected  values  (deterministic  portion)  of 
the  pixel  gray  levels  are  convex  combinations  of  the  signals 
from  the  corresponding  overlapping  tissues.  In  the  previ¬ 
ous  section,  we  described  two  definitions  of  correction  for 
partial  volume  averaging  effects.  Here,  we  use  the  second 
definition  which  utilizes  the  noiseless  model  for  the  MR! 
gray  levels  (Using  the  first  definition,  («)  cannot  be  proved 
in  general.).  For  the  noiseless  case,  a  pixel  vector  can  be 
represented  by 

m 

Pjfc  =  oidSd  +  ^  ociSi  (7) 

*2=1 

i^4. 

where  {o^,  i  =  are  partial  volumes  of  the  tis¬ 

sues  in  the  corresponding  voxel.  For  this  case,  Eq.  (4)  is 
simplified  to 

m 

TIjkd  =  =  Tdioid^d  +  ^  =  cidTdih)-  (8) 

isl 

Now  let  Qj  =  0/or  i  ^  I  and  aj  =  1  for  a  fixed  I  ^  d,  then 
Eq.  (8)  results  in 

Fdisi)  =  0.  (9) 

The  above  argument  can  me  made  for  each  of  the  interfer¬ 
ing  features.  This  shows  that  correction  of  psirtial  volume 
averaging  effects  requires  removal  of  the  interfering  fea¬ 
tures. 

(ii) Intuitively, the transformation (demodulator)
should be linear since partial volume information is
linearly modulated into the MRI signal (see Eq. (2)). For a
formal proof, consider again the general representation of
a noiseless pixel vector given by Eq. (7). Then, Eq. (8) can
be written as

$$T_d\Big(\alpha_d \mathbf{s}_d + \sum_{i=1,\, i \neq d}^{m} \alpha_i \mathbf{s}_i\Big) = \alpha_d T_d(\mathbf{s}_d) = \alpha_d T_d(\mathbf{s}_d) + \sum_{i=1,\, i \neq d}^{m} \alpha_i T_d(\mathbf{s}_i). \qquad (10)$$

The last equality uses the necessity of removal of the
interfering features proven in (i). Eq. (10) proves the
linearity of the desired transformation. Note that, although
$T_d(\mathbf{s}_i) = 0$ for $i = 1, \ldots, m$, $i \neq d$, we have $T_d(\mathbf{s}_d) \neq 0$, so
non-linear terms cannot be added to the right hand side of
Eq. (10) except those which are identically zero; the
resultant transformation is then effectively linear even though
it may have a non-linear appearance.

(iii) We need to show that, for a linear transformation,
removal of the interfering features requires correction of
partial volume averaging effects. This is verified by simplifying
$E[LTI_{jkd}]$ for an arbitrary pixel vector from the image
sequence. Using our assumptions regarding a priori known
signature vectors, presence of an additive zero-mean white
noise field, and considering $\mathbf{t}_d$ as the transformation vector
for the linear transformation, we have

$$\begin{aligned}
E[LTI_{jkd}] &= E\Big[\mathbf{t}_d \cdot \Big(\sum_{i=1}^{m} \alpha_i \mathbf{s}_i + \boldsymbol{\eta}_{jk}\Big)\Big] \\
&= E\Big[\sum_{i=1}^{m} \alpha_i\, \mathbf{t}_d \cdot \mathbf{s}_i + \mathbf{t}_d \cdot \boldsymbol{\eta}_{jk}\Big] \\
&= \sum_{i=1}^{m} \alpha_i\, \mathbf{t}_d \cdot \mathbf{s}_i \\
&= \alpha_d\, \mathbf{t}_d \cdot \mathbf{s}_d \\
&= \alpha_d\, T_d(\mathbf{s}_d). \qquad (11)
\end{aligned}$$

In the above we have used the removal of the interfering
features and the linearity of the expected value operator.
Q.E.D.
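
As a quick numerical check of part (iii), the sketch below evaluates the noiseless case of Eq. (11) for a hypothetical three-image sequence. The signature vectors, the partial volumes, and the helper names are illustrative assumptions rather than values from the paper, and the cross product is used only because it is a convenient way to get a vector orthogonal to two interfering signatures when n = 3.

```python
# Noiseless check of Eq. (11): if t_d is orthogonal to every interfering
# signature vector, the transformed gray level equals alpha_d * (t_d . s_d).

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    # 3-D cross product; the result is orthogonal to both u and v.
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

# Hypothetical signature vectors (n = 3 images, m = 3 features).
s_d = [4.0, 2.0, 1.0]   # desired feature
s_2 = [1.0, 1.0, 0.0]   # interfering feature
s_3 = [0.0, 1.0, 1.0]   # interfering feature

t_d = cross(s_2, s_3)   # satisfies t_d . s_2 = t_d . s_3 = 0

# Noiseless pixel vector: convex combination of the signatures, Eq. (7).
alpha_d, alpha_2, alpha_3 = 0.3, 0.5, 0.2
pixel = [alpha_d * a + alpha_2 * b + alpha_3 * c
         for a, b, c in zip(s_d, s_2, s_3)]

transformed = dot(t_d, pixel)
recovered_alpha = transformed / dot(t_d, s_d)
print(recovered_alpha)
```

Dividing by the value in a pure region, t_d . s_d, mirrors the normalization in Eq. (4); the interfering fractions drop out because their signatures are removed by t_d.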

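Considering the results of Theorem 1, the optimal
transformation vector should maximize the $\mathrm{SNR}_d$ in Eq. (6)
while satisfying the constraints $\mathbf{t}_d \cdot \mathbf{s}_k = 0$ for
$k = 1, \ldots, m$, $k \neq d$. The following Theorem gives the
transformation vector and proves its optimality.

Theorem 2 The solution to the problem

$$\mathrm{Max.}\ \left[\mathrm{SNR}_d = \frac{\mathbf{t}_d \cdot \mathbf{s}_d}{\sigma (\mathbf{t}_d \cdot \mathbf{t}_d)^{1/2}}\right] \qquad (12)$$

subject to the constraint that

$$\mathbf{t}_d \cdot \mathbf{s}_k = 0, \quad \text{for } k = 1, \ldots, m,\ k \neq d \qquad (13)$$

is given by

$$\mathbf{t}_d = \mathbf{s}_d - \mathbf{s}_d^{p} \qquad (14)$$

where $\mathbf{s}_d^{p}$ is the projection of $\mathbf{s}_d$ onto the subspace spanned
by $\{\mathbf{s}_k,\ k = 1, \ldots, m,\ k \neq d\}$ (undesired subspace). In
addition, $\mathbf{t}_d$ can easily be computed using a Gram-Schmidt
orthogonalization procedure.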

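Proof Any n-dimensional vector in general, and $\mathbf{t}_d$ in
particular, can be represented as [17]

$$\begin{aligned}
\mathbf{t}_d &= \mathbf{t}_d^{f} + \mathbf{t}_d^{f\perp} \\
&= (\mathbf{t}_d^{u} + \mathbf{t}_d^{u\perp}) + \mathbf{t}_d^{f\perp} \\
&= \sum_{i=1,\, i \neq d}^{m} c_i \mathbf{s}_i + a\,\mathbf{s}_d^{o} + \mathbf{h} \qquad (15)
\end{aligned}$$

where $\mathbf{t}_d^{f}$ and $\mathbf{t}_d^{f\perp}$ are projections of $\mathbf{t}_d$ onto the subspace
generated by the signature vectors (feature subspace) and
its orthogonal complement, respectively. Similarly, $\mathbf{t}_d^{u}$
and $\mathbf{t}_d^{u\perp}$ are projections of $\mathbf{t}_d^{f}$ onto the undesired subspace
and its orthogonal complement in the feature subspace,
respectively. Since $\mathbf{t}_d^{u\perp}$ is proportional to $\mathbf{s}_d^{o}$ (the orthogonal
complement, in the feature subspace, of the projection of
$\mathbf{s}_d$ onto the undesired subspace), it is replaced by $a\,\mathbf{s}_d^{o}$. For
convenience, $\mathbf{t}_d^{f\perp}$ is denoted by $\mathbf{h}$. Finally, $a$ and $c_i$ are
scalar coefficients to be determined.

Assuming the undesired signature vectors are linearly
independent, the constraint that the undesired signature
vectors have zero projections onto the transformation
vector requires

$$c_i = 0, \quad \text{for } i = 1, \ldots, m,\ i \neq d. \qquad (16)$$

This requirement can be mathematically established by
substituting Eq. (15) into Eq. (13) followed by an
employment of matrix notations:

$$\begin{aligned}
\mathbf{t}_d \cdot \mathbf{s}_k &= \Big(\sum_{i=1,\, i \neq d}^{m} c_i \mathbf{s}_i + a\,\mathbf{s}_d^{o} + \mathbf{h}\Big) \cdot \mathbf{s}_k \\
&= \sum_{i=1,\, i \neq d}^{m} c_i\, \mathbf{s}_i \cdot \mathbf{s}_k = 0, \quad \text{for } k = 1, \ldots, m,\ k \neq d. \qquad (17)
\end{aligned}$$

Define the $n \times (m-1)$ matrix $\mathbf{U}$ by putting the undesired
signature vectors $\{\mathbf{s}_k,\ k = 1, \ldots, m,\ k \neq d\}$ in
its columns, and the $(m-1)$-dimensional vector $\mathbf{C}$ using
$\{c_i,\ i = 1, \ldots, m,\ i \neq d\}$ as its elements, in the same
order as the undesired signature vectors used in defining $\mathbf{U}$.
Then Eq. (17) can be re-written as

$$(\mathbf{U}^T \mathbf{U})\,\mathbf{C} = \mathbf{0}. \qquad (18)$$

Since the undesired signature vectors are assumed to be
linearly independent, $\mathbf{U}^T \mathbf{U}$ has rank $(m-1)$ and the unique
solution to Eq. (18) is $\mathbf{C} = \mathbf{0}$.

Therefore, $\mathbf{t}_d$ can be written as $\mathbf{t}_d = a\,\mathbf{s}_d^{o} + \mathbf{h}$.
Substituting this representation of $\mathbf{t}_d$ in Eq. (12), we obtain

$$\mathrm{SNR}_d = \frac{\mathbf{t}_d \cdot \mathbf{s}_d}{\sigma (\mathbf{t}_d \cdot \mathbf{t}_d)^{1/2}} = \frac{a\,\mathbf{s}_d^{o} \cdot \mathbf{s}_d}{\sigma \left(a^2 \|\mathbf{s}_d^{o}\|^2 + \|\mathbf{h}\|^2\right)^{1/2}}. \qquad (19)$$

Eq. (19) shows that any non-zero $\mathbf{h}$ increases the
denominator, hence lowers $\mathrm{SNR}_d$. This may indicate that the
maximum $\mathrm{SNR}_d$ is obtained for $\mathbf{h} = \mathbf{0}$. To establish this
result in a formal mathematical manner, we note that the
necessary condition is

$$\nabla_{\mathbf{h}} \mathrm{SNR}_d = \begin{bmatrix} \dfrac{-h_1\, a\,\mathbf{s}_d^{o} \cdot \mathbf{s}_d}{\sigma \left(a^2 \|\mathbf{s}_d^{o}\|^2 + \|\mathbf{h}\|^2\right)^{3/2}} \\ \vdots \\ \dfrac{-h_n\, a\,\mathbf{s}_d^{o} \cdot \mathbf{s}_d}{\sigma \left(a^2 \|\mathbf{s}_d^{o}\|^2 + \|\mathbf{h}\|^2\right)^{3/2}} \end{bmatrix} = \mathbf{0} \qquad (20)$$

which is equivalent to

$$h_i = 0, \quad \text{for } i = 1, \ldots, n, \quad \text{or} \quad \mathbf{h} = \mathbf{0} \qquad (21)$$

and a sufficient condition is

$$\nabla_{\mathbf{h}}^{2} \mathrm{SNR}_d \Big|_{\mathbf{h} = \mathbf{0}} = \mathrm{diag}\!\left(\frac{-a\,\mathbf{s}_d^{o} \cdot \mathbf{s}_d}{\sigma |a|^3 \|\mathbf{s}_d^{o}\|^3}, \ldots, \frac{-a\,\mathbf{s}_d^{o} \cdot \mathbf{s}_d}{\sigma |a|^3 \|\mathbf{s}_d^{o}\|^3}\right) = \frac{-a\,\mathbf{s}_d^{o} \cdot \mathbf{s}_d}{\sigma |a|^3 \|\mathbf{s}_d^{o}\|^3}\, \mathbf{I} < 0 \qquad (22)$$

where "$< 0$" stands for negative definiteness. This
condition is satisfied for any $a > 0$ since

$$\begin{aligned}
\mathbf{s}_d^{o} \cdot \mathbf{s}_d &= (\mathbf{s}_d - \mathbf{s}_d^{p}) \cdot \mathbf{s}_d \\
&= \|\mathbf{s}_d\|^2 - \|\mathbf{s}_d\|^2 \cos^2\theta \\
&= \|\mathbf{s}_d\|^2 \sin^2\theta > 0 \qquad (23)
\end{aligned}$$

where $\theta$ is the angle between $\mathbf{s}_d$ and its projection onto the
undesired subspace. Therefore,

$$\mathbf{t}_d = a\,\mathbf{s}_d^{o} = a(\mathbf{s}_d - \mathbf{s}_d^{p}), \quad a > 0 \qquad (24)$$

is a solution. Since $a$ is a scaling factor, it can be set equal
to one without loss of generality. The vector $\mathbf{s}_d^{o}$ can be
found by

$$\mathbf{s}_d^{o} = [\mathbf{I} - \mathbf{U}(\mathbf{U}^T \mathbf{U})^{-1} \mathbf{U}^T]\,\mathbf{s}_d \qquad (25)$$

where $\mathbf{I}$ is the $n \times n$ identity matrix. Eq. (25) yields the
solution since it subtracts from $\mathbf{s}_d$ its projection onto the
subspace spanned by the columns of $\mathbf{U}$ (i.e., the undesired
subspace). This calculation, however, needs a matrix
inversion which may be numerically unstable. Linking
the orthogonality of $\mathbf{s}_d^{o}$ to the undesired subspace (vectors)
with that of the Gram-Schmidt orthogonalization [17]
(which is numerically stable), it becomes evident that $\mathbf{s}_d^{o}$
may be found using the Gram-Schmidt method. The key
point is that in each step, the Gram-Schmidt
orthogonalization removes the projection of the new vector onto the
subspace defined by the previous vectors (see footnote 1).
Therefore, by using $\{\mathbf{s}_k,\ k = 1, \ldots, m,\ k \neq d\}$ in an arbitrary order
and then $\mathbf{s}_d$ in a Gram-Schmidt orthogonalization, the last
output vector will be $(\mathbf{s}_d - \mathbf{s}_d^{p})$, which is exactly what we
want. The vector $\mathbf{t}_d$ is always non-zero except for the case
in which $\mathbf{s}_d$ is parallel to the undesired subspace, which is
very unlikely to happen. Q.E.D.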
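
The Gram-Schmidt construction described in the proof can be sketched as follows. This is a minimal illustration using modified Gram-Schmidt in plain Python; the function name `gram_schmidt_residual` and the signature values are assumptions made for the example and do not come from the paper.

```python
# Sketch: build t_d = s_d - (projection of s_d onto the undesired subspace)
# by orthogonalizing the undesired signatures first and s_d last.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt_residual(undesired, s_d, tol=1e-12):
    """Return s_d minus its projection onto span(undesired)."""
    basis = []  # orthonormal basis of the undesired subspace
    for s in undesired:
        r = list(s)
        for q in basis:
            c = dot(r, q)
            r = [ri - c * qi for ri, qi in zip(r, q)]
        norm = dot(r, r) ** 0.5
        if norm > tol:  # skip vectors that add nothing to the span
            basis.append([ri / norm for ri in r])
    t = list(s_d)
    for q in basis:
        c = dot(t, q)
        t = [ti - c * qi for ti, qi in zip(t, q)]
    return t

# Hypothetical signature vectors for a four-image sequence (n = 4, m = 3).
s_desired = [10.0, 8.0, 6.0, 4.0]
s_undesired = [[9.0, 5.0, 3.0, 2.0], [4.0, 4.0, 3.5, 3.0]]

t_d = gram_schmidt_residual(s_undesired, s_desired)
print([round(dot(t_d, s), 9) for s in s_undesired])  # constraints of Eq. (13)
print(dot(t_d, s_desired) > 0)                        # positive numerator in Eq. (12)
```

No matrix inverse is formed, which is the practical point made in the proof when comparing this route with Eq. (25).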

1)  Existence  of  the  Solution:  In  order  to  guarantee 
the  existence  of  the  transformation  vectors,  the  signature 
vectors  should  be  linearly  independent.  This  requires  that 
the  number  of  unique  images  in  the  sequence  (n)  be  greater 
than  or  equal  to  the  number  of  signature  vectors  (m).  Here, 
a  unique  image  is  one  that  is  not  a  linear  combination  of 
other  images  in  the  sequence. 
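
The existence condition can be checked numerically before any weighting vector is computed. The sketch below counts linearly independent signature vectors with a Gram-Schmidt pass; the function name `independent_count`, the tolerance, and the test vectors are illustrative assumptions.

```python
# Sketch: the transformation vectors exist when all m signature vectors
# are linearly independent, which also forces n >= m.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def independent_count(vectors, tol=1e-9):
    """Number of linearly independent vectors, found by Gram-Schmidt."""
    basis = []
    for v in vectors:
        r = list(v)
        for q in basis:
            c = dot(r, q)
            r = [ri - c * qi for ri, qi in zip(r, q)]
        norm = dot(r, r) ** 0.5
        if norm > tol:
            basis.append([ri / norm for ri in r])
    return len(basis)

# Hypothetical signatures from a four-image sequence.
independent_set = [[10.0, 8.0, 6.0, 4.0],
                   [9.0, 5.0, 3.0, 2.0],
                   [4.0, 4.0, 3.5, 3.0]]
# Third vector is the sum of the first two, so only two are independent.
dependent_set = [[10.0, 8.0, 6.0, 4.0],
                 [9.0, 5.0, 3.0, 2.0],
                 [19.0, 13.0, 9.0, 6.0]]

print(independent_count(independent_set) == len(independent_set))
print(independent_count(dependent_set) == len(dependent_set))
```

The absolute tolerance is a simplification; for signatures with very different magnitudes a relative threshold would be a safer choice.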

Footnote 1: Details of the Gram-Schmidt orthogonalization may be found in
most of the linear algebra textbooks as well as [17].

III. RESULTS

A.  Advantages  of  Our  Approach 

Composite images generated by the optimal transformation
are similar to the eigenimages generated by solving
generalized eigenvalue problems [6]. It can be mathematically
shown that the weighting vector for the eigenimage
filtering with one or two interfering features approaches
the transformation vector for our optimal transformation
as the regularizing parameter w (defined in [6]) tends
to zero. Hence, this work may be considered as a new
approach to the derivation of the eigenimage filter as a
transformation, with several advantages including:

1. Well-defined contrast criteria;

2. No need for the costly numerical solutions to the
generalized eigenvalue problems;

3. Straightforward analytical solution for the general case
of multiple interfering features;

4. Fast and numerically stable calculation of the weighting
vectors using the Gram-Schmidt orthogonalization;

5. Exact correction for partial volume averaging effects;

6. Explicit and simple expression for the SNR of the
eigenimage, for the case of known or well-estimated
signature vectors and additive zero-mean equi-power
white noise;

7. Suggesting the normalization of the original images
to the standard deviation of noise to yield the equi-power
white noise case for which the maximum SNR
is obtained by the eigenimage filter.

B.  Examples 

We use MRI simulation, phantom, and brain images to
illustrate and evaluate the optimal transformation. In
generating the simulation, each object (region) is assigned a
signature vector equal to that of a normal brain tissue
(white matter, gray matter, or cerebrospinal fluid (CSF)).
These signature vectors are estimated by averaging image
gray levels of normal brain tissues in a four-echo multiple
spin-echo protocol with TE/TR = 25, 50, 75, 100/2500
msec. As explained in Section II, these signature vectors
are non-linear functions of tissue and pulse sequence
parameters and thus are linearly independent. Regions of
partial volume averaging effects are included in the
simulation, where the fractional components of the neighboring
objects are known on a pixel-by-pixel basis (they change
linearly from 0% to 100% for each tissue, from one row
to the next or from one column to the next). Zero-mean
white Gaussian noise with a standard deviation of 0.6 was
added to the simulation.


Figure 1: Original images of the simulation. (a)-(d) Four
multiple spin-echo images with TE/TR = 25, 50, 75,
100/2500 msec. ROIs used for estimating signature vectors
are shown in image (b).

Figure 2: Transformed images of the simulation. (a)-(c)
Transformed images for the central region, the region on
the right, and the region on the left, respectively.

Table I: Original (org) and Estimated (est) Values of Partial
Volumes in the Simulation.

Figure 3: Original images of an egg phantom. (a)-(d)
Four multiple spin-echo images with TE/TR = 25, 50, 75,
100/2500 msec. ROIs used for estimating signature vectors
are shown in image (b). Note that the zipper artifact
in these images is due to the instrument imperfection and
thus is not considered as a feature (tissue type). As a
result, it is projected to the transformed images shown in
Fig. ...

Figure 4: Transformed images of the egg phantom. (a)-(c)
Transformed images for egg yolk, egg white, and gelatin,
respectively.

Figure 5: An original and three transformed images for
the egg phantom. (a) The first original image of the egg
phantom. (b)-(d) Three linearly transformed images for
the egg white, all of which correct for partial volume
averaging effects but only image (d) maximizes the SNR.

Figure 6: Original images of a human brain. (a)-(c) First
three images of a four-echo multiple spin-echo image
sequence with TE/TR = 19, 38, 57, 76/1500 msec. (d) An
inversion recovery image with TE/TI/TR = 12/519/2000
msec. ROIs used for estimating signature vectors are
shown in image (b).

Figure 7: Transformed images of the human brain. (a)-(c)
Transformed images for white matter, gray matter, and
cerebrospinal fluid (CSF), respectively.

Figure 8: A three-dimensional view of the egg phantom
with the partial volume regions visualized.

Figure 9: A three-dimensional view of the human brain
with the partial volume regions visualized. Note that the
non-real objects seen outside of the head are due to the
flow artifacts generated during the image acquisition.



The  simulated  images  are  shown  in  Fig.  1.  They  simu¬ 
late  pure  and  partial  volume  regions  for  white  matter,  gray 
matter,  and  CSF.  Regions  of  interest  have  been  drawn  on 
the  pure  portions  of  each  object  to  estimate  signature  vec¬ 
tors  and  calculate  the  transformation  matrix.  These  re¬ 
gions  of  interest  are  shown  in  Fig.  1(b),  and  the  resulting 
transformed  images  are  shown  in  Fig.  2.  Each  transformed 
image  illustrates  segmentation  of  the  corresponding  desired 
feature  from  the  interfering  features  and  visualizes  its  par¬ 
tial  volume  information.  Several  partial  volume  percent¬ 
ages  of  each  feature  are  estimated  using  Eq.  (4),  by  drawing 
horizontal  or  vertical  lines  (ROIs)  on  the  partial  volume  re¬ 
gions  and  calculating  the  sample  means  (estimates  of  the 
expected  values)  for  these  ROIs  and  dividing  the  results 
by  the  sample  means  of  ROIs  drawn  on  the  corresponding 
pure  regions.  The  results  are  summarized  in  Table  I.  The 
original  (those  used  in  generating  the  simulation)  and  es¬ 
timated  values  are  in  close  agreement.  Small  differences 
are  due  to  the  limitation  in  the  number  of  noisy  pixels 
available  for  the  estimation. 
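
The ROI-based estimate described above (sample mean over a partial volume ROI divided by the sample mean over a pure ROI) can be sketched with a small Monte Carlo experiment. Only the noise standard deviation of 0.6 is taken from the text; the signature vectors, the weighting vector, the ROI size, and the random seed are hypothetical.

```python
import random

# Sketch: estimate a partial volume fraction from noisy transformed pixels.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

random.seed(0)
SIGMA = 0.6                 # noise standard deviation quoted in the text
s_d = [4.0, 2.0, 1.0]       # hypothetical desired signature
s_2 = [1.0, 1.0, 0.0]       # hypothetical interfering signature
t_d = [1.0, -1.0, 1.0]      # orthogonal to s_2, so s_2 is removed

def noisy_pixel(alpha_d):
    """Pixel vector with fraction alpha_d of the desired tissue plus noise."""
    clean = [alpha_d * a + (1.0 - alpha_d) * b for a, b in zip(s_d, s_2)]
    return [c + random.gauss(0.0, SIGMA) for c in clean]

def roi_mean(alpha_d, n_pixels):
    """Sample mean of the transformed gray level over an ROI."""
    total = sum(dot(t_d, noisy_pixel(alpha_d)) for _ in range(n_pixels))
    return total / n_pixels

true_alpha = 0.4
estimate = roi_mean(true_alpha, 5000) / roi_mean(1.0, 5000)
print(round(estimate, 3))
```

With fewer pixels per ROI the estimate scatters more widely around the true fraction, which is consistent with the remark that the small differences in Table I come from the limited number of noisy pixels.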

Original  and  transformed  images  of  an  egg  phantom  are 
shown  in  Figs.  3  and  4,  respectively.  The  partial  volumes 
between  egg  white  and  egg  yolk,  and  between  egg  white 
and  gelatin  are  not  known  on  a  pixel-by-pixel  basis,  but 
their  averages  may  be  estimated  by  the  water  displace¬ 
ment  method  [2].  These  partial  volumes  are  visualized  in 
the  transformed  images  in  Fig.  4.  An  original  image  of 
the  egg  phantom  and  three  linearly  transformed  images 
for  the  egg  white  are  shown  in  Fig.  5.  All  of  the  trans¬ 
formed  images  satisfy  the  condition  for  correcting  partial 
volume  averaging  effects,  i.e.,  Eq.  (4),  but  they  have  dif¬ 
ferent  SNRs.  They  illustrate  the  need  for  considering  SNR 
maximization  in  the  problem  formulation  and  its  effect  on 
the  transformed  image  quality. 
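
The effect illustrated by Fig. 5 can be reproduced in miniature: several weighting vectors satisfy the constraint of Eq. (13), but adding a component h orthogonal to the feature subspace lowers the SNR, as Eq. (19) predicts. The vectors below are hypothetical and chosen so that the arithmetic is easy to follow; a single interfering feature is assumed.

```python
# Sketch: compare the SNR of the optimal vector with constraint-satisfying
# alternatives t = t_opt + beta * h, where h is orthogonal to s_d and s_2.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def snr(t, s, sigma):
    # SNR definition of Eq. (6)/(12): t.s / (sigma * ||t||)
    return dot(t, s) / (sigma * dot(t, t) ** 0.5)

SIGMA = 1.0
s_d = [4.0, 2.0, 1.0]   # desired signature (illustrative)
s_2 = [1.0, 1.0, 0.0]   # single interfering signature (illustrative)

# Optimal vector for one interfering feature: s_d minus its projection on s_2.
coef = dot(s_d, s_2) / dot(s_2, s_2)
t_opt = [a - coef * b for a, b in zip(s_d, s_2)]

h = [-1.0, 1.0, 2.0]    # orthogonal to both s_d and s_2

betas = (0.0, 0.5, 1.0)
candidates = [[a + beta * b for a, b in zip(t_opt, h)] for beta in betas]
snrs = [snr(t, s_d, SIGMA) for t in candidates]

for beta, t, value in zip(betas, candidates, snrs):
    print(beta, dot(t, s_2), round(value, 4))
```

All three candidates null the interfering feature, so all three correct for partial volume averaging; only the choice beta = 0 attains the maximum SNR.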

Figs.  6  and  7  show  the  original  and  transformed  images 
of  a  normal  human  brain,  respectively.  The  desired  feature 
is  segmented  and  the  partial  volumes  between  different  tis¬ 
sues  are  visualized  in  the  transformed  images  in  Fig.  7. 

The transformed images from several slices through the
object can be used for three-dimensional (3-D)
visualization. An advantage of using these optimally transformed
images is that they enable us to segment regions of partial
volume averaging effects from regions of pure materials.
Figs. 8 and 9 show 3-D images of an egg phantom and a
human brain, in which regions of pure and partial volumes
are distinguished from each other. Details of the procedure
for generating these 3-D images are presented in [18], [19].

IV. SUMMARY AND CONCLUSION

The optimal transformation for simultaneous correction of
partial volume averaging effects and maximizing SNR was
derived. No linearity assumption was initially made for the
transformation. The required properties for the
transformation were: (i) correcting for partial volume averaging
effects; and (ii) maximizing the SNR of a desired feature
in the transformed image. It was shown that property (i)
requires removal of the interfering features from the
transformed image. Using this, it was proved that property (i)
can only be achieved by a linear transformation. It was
finally shown that the optimal transformation matrix can
be easily and numerically stably obtained using the
Gram-Schmidt orthogonalization procedure.

For the mathematical development, we assumed, based
on physics of MRI, that the signature vectors were linearly
independent. Although we did not address a situation in
which the original signature vectors were linearly
dependent, it can be shown that as long as the desired signature
vector is linearly independent from the undesired signature
vectors, the corresponding optimal transformation vector
will correct its partial volumes. However, the partial
volume information of the feature whose signature vector is
linearly dependent on the rest of the signature vectors
cannot be corrected. This is simply because in this situation,
there are more unknowns than pieces of information. This
result may also be observed from the fact stated at the end
of the Proof to Theorem 2, i.e., $\mathbf{t}_d = \mathbf{0}$ in this case.

Each column of the optimal transformation matrix is
similar to a weighting vector for the eigenimage filter.
Therefore, this work may also be considered as a new
approach to the derivation of the eigenimage filter, with
several advantages including well-defined contrast criteria,
straightforward analytical solution along with fast and
numerically stable calculation of the weighting vectors, and
exact correction for partial volume averaging effects. We
believe that the results of the new approach shed new
light on the extended applications of the eigenimage filter
to various clinical and industrial problems. It also
establishes the applicability of the eigenimage filter to fast and
accurate volume determinations.

We used MRI simulation, phantom, and brain images
to illustrate and evaluate the optimal transformation. We
showed that in all cases the desired feature was segmented
from the interfering features and partial volume
information was visualized in the resulting transformed images.
Using the transformed images, the partial volumes of
different features in the simulation were estimated. They
were within 1.33% of the actual partial volumes.

In all examples, signature vectors were estimated by
averaging several pixels (more than 50) in pure regions of
objects (tissues). In a previous publication [20], we showed
that this method yields satisfactory estimates of the
signature vectors. A stochastic error propagation analysis
similar to that of [20] may be used to assess inaccuracy of
the transformed images when signature vectors are noisy.
A deterministic analysis similar to that of [2] may be used
when signature vectors are contaminated, i.e., a pure ROI
for each object is not selected. These analyses were beyond
the scope of this paper.

The constrained optimization problem formulated in
Theorem 2 and its analytical solution can be modified to
design other useful transformations. An example may be
found in [21], where we used it to design the optimal
linear filter for maximizing the contrast-to-noise ratio (CNR)
between a desired feature and multiple interfering features
in MRI.

Abstract

Segmentation of a feature of interest while correcting for
partial volume averaging effects is a major tool for
identification of hidden abnormalities, fast and accurate
volume calculation, and three-dimensional visualization in the
field of magnetic resonance imaging (MRI). We present the
optimal transformation for simultaneous segmentation of
a desired feature (SDF) and correction of partial volume
averaging effects (CPV), while maximizing the
signal-to-noise ratio (SNR) of the desired feature. It is proved that
CPV requires the removal of the interfering features from
the scene (RIF). It is also proved that CPV can be achieved
merely by a linear transformation. It is finally shown that
the optimal transformation matrix is easily obtained using
the Gram-Schmidt orthogonalization procedure, which
is numerically stable. Applications of the technique to
MRI simulation, phantom, and brain images were shown in
the presentation. We showed that in all cases the desired
feature was segmented from the interfering features and
partial volume information was visualized in the resulting
transformed images.

I. INTRODUCTION

Segmentation of a feature of interest while correcting for
partial volume averaging effects is a major tool for MRI
image analysis and interpretation. Its applications include
identification of hidden abnormalities [1], fast and accurate
volume calculation [2], and three-dimensional visualization
[3]. In an MRI sequence consisting of several images of the
same anatomical site, the image gray levels corresponding
to different tissue types change characteristically
throughout the image sequence and contain information pertaining
to partial volume averaging effects. This makes it
possible to generate a set of transformed images in which the
partial volumes of each feature are visualized.

We derive the optimal transformation for correcting
partial volume averaging effects. Optimality is defined as
maximizing the SNR of the desired feature, i.e., the feature
whose partial volumes are corrected and visualized. In
Section II, after defining notations and CPV, we establish
the relationship between CPV and RIF. This relationship
is used to show that CPV can be achieved merely by a
linear transformation. Then we find the optimal
transformation matrix using the Gram-Schmidt orthogonalization
procedure. In Section III, we present advantages of the new
approach over the previous one for deriving the eigenimage
filter [4]. In the presentation, we showed applications of
the technique to simulation, phantom, and brain images.
Conclusions are given in Section IV.


II.  METHODS 

A.  Problem  Formulation 

1) Notations: Let $V$ and $W$ be n-dimensional and
m-dimensional real vector spaces, respectively. Then
points in $V$ and $W$ are vectors in $\mathbb{R}^n$ and $\mathbb{R}^m$,
respectively. Let n be the number of images in the sequence and
m be the number of transformed images, which equals the
number of features (objects) in the scene. Then a pixel
vector $\mathbf{P}_{jk} = [P_{jk1}\ P_{jk2}\ \cdots\ P_{jkn}]^T$ is an n-dimensional
vector whose elements are the gray levels of the (j, k)-th
pixels in the images in the sequence. The n-dimensional
vectors $\{\mathbf{s}_l = [s_{l1}\ s_{l2}\ \cdots\ s_{ln}]^T,\ l = 1, \ldots, m\}$, whose i-th
element defines a specific feature in the i-th image, are called
signature vectors. The vector $\mathbf{s}_d$ ($1 \le d \le m$) represents
the desired feature, and each of the other signature vectors
represents an interfering feature. A pixel in a transformed
image (TI) is a function of all pixels at the same location
in the original images, i.e.,

$$TI_{jk} = T(\mathbf{P}_{jk}) \qquad (1)$$

where $TI_{jk}$ is the gray level of the (j, k)-th pixel in the
transformed image, and $T$ is the function to be found.
For a linear transformation, $T(\mathbf{P}_{jk}) = \mathbf{t} \cdot \mathbf{P}_{jk}$, where
$\mathbf{t} = [t_1\ t_2\ \cdots\ t_n]^T$ is the transformation (weighting) vector to
be determined.
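
To make the notation concrete, the sketch below forms the pixel vector at each location from a stack of n images and applies a linear transformation t . P_jk. The two tiny 2-by-2 images, the weights, and the function names are made-up values for illustration only.

```python
# Sketch: pixel vectors and a linearly transformed image from an image stack.
# images[i][j][k] is the gray level of pixel (j, k) in the i-th image.

def pixel_vector(images, j, k):
    """n-dimensional vector of gray levels at location (j, k)."""
    return [image[j][k] for image in images]

def linear_transform(images, t):
    """Transformed image whose (j, k)-th pixel is t . P_jk."""
    rows, cols = len(images[0]), len(images[0][0])
    return [[sum(w * g for w, g in zip(t, pixel_vector(images, j, k)))
             for k in range(cols)]
            for j in range(rows)]

# Hypothetical sequence of n = 2 images, each 2 x 2 pixels.
images = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[5.0, 6.0], [7.0, 8.0]],
]
t = [2.0, -1.0]  # illustrative weighting vector

print(pixel_vector(images, 0, 1))
print(linear_transform(images, t))
```

Each transformed pixel depends only on the pixels at the same location in the original images, as Eq. (1) requires.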

2) Correction for Partial Volume Averaging Effects:
The MR signal $S$ from a voxel containing m different
materials is given by [5]

$$S = \sum_{i=1}^{m} \frac{V_i}{V}\, S_i \qquad (2)$$

where $V_i$ is the volume of the i-th material within the voxel,
$V$ is the total volume of the voxel, and $S_i$ is the signal from
the i-th material. The gray level $P_{jk}$ of the (j, k)-th pixel
(corresponding to the (j, k)-th voxel) in an MR image is
given by

$$P_{jk} = E[P_{jk}] + \eta_{jk} = \sum_{l=1}^{m} V_{ljk}\, S_l + \eta_{jk} \qquad (3)$$

where $V_{ljk}$ is the partial volume of the l-th material in the
(j, k)-th voxel, and $\eta_{jk}$ represents statistical noise which is
assumed to be an additive zero-mean white Gaussian noise
field with standard deviation $\sigma$. Note that $E[P_{jk}]$ is
deterministic but unknown, while the noise $\eta_{jk}$ is stochastic, so
that the pixel gray level $P_{jk}$ is the sum of a deterministic
function (to be estimated) and noise. We use the notation
$E[P_{jk}]$ to denote the original, deterministic value of the
pixel gray level.

CPV means that we generate an image whose pixel gray
levels are proportional to the percentage of a specific tissue
in each voxel on average. Mathematically, this translates
to generating a transformed image (TI) in which

$$E[TI_{jk}] = (V_{djk})\, E[T(P_{lm})] \qquad (4)$$

where $E[TI_{jk}]$ is the mean value of the (j, k)-th pixel in
the transformed image, $V_{djk}$ is the partial volume of the
desired material in the (j, k)-th voxel, and $E[T(P_{lm})]$ is
the mean value of the (l, m)-th pixel in the desired region
of interest (ROI) of the transformed image. This correction
is necessary for robust interpretation and analysis of MR
images, as well as for volume calculations.


3) Signal-to-Noise Ratio: Linearly transformed images
are linear combinations of the images in the sequence,
using different transformation vectors. Since we have m
signature vectors each of which can be considered as the
desired signature vector, there are a total of m different
transformation vectors resulting in m different transformed
images. The pixel gray levels of these linearly transformed
images ($\{LTI_l,\ l = 1, \ldots, m\}$) are given by

$$LTI_{jkl} = \sum_{i=1}^{n} T_{il}\, P_{jki} = \mathbf{t}_l \cdot \mathbf{P}_{jk}, \quad l = 1, \ldots, m \qquad (5)$$

where $LTI_{jkl}$ is the gray level of the (j, k)-th pixel in the
l-th linearly transformed image, $\mathbf{t}_l = [T_{1l}\ T_{2l}\ \cdots\ T_{nl}]^T$
is the l-th transformation vector to be determined, and
$\mathbf{T} = [\mathbf{t}_1, \mathbf{t}_2, \ldots, \mathbf{t}_m]$. For a linear transformation with the
transformation vector $\mathbf{t}_l$, and the presence of an additive
zero-mean white noise field with standard deviation $\sigma$ in
the image sequence, the SNR of the desired feature with
the signature vector $\mathbf{s}_l$ is expressed by [6]

$$\mathrm{SNR}_l = \frac{\mathbf{t}_l \cdot \mathbf{s}_l}{\sigma (\mathbf{t}_l \cdot \mathbf{t}_l)^{1/2}} \qquad (6)$$

4) Optimal Transformation: We seek a transformation
that achieves the following objectives simultaneously:

• CPV, i.e., satisfying Eq. (4);

• Maximizing SNR of the desired feature, i.e., maximizing
Eq. (6).

Theorems 1 and 2 in Section B establish the relationship
between CPV, RIF, and the linearity of the transformation,
and then find the solution.


B. Derivation of Solution

Theorem 1 (i) For any transformation, CPV requires
RIF; (ii) CPV can be achieved only by a linear
transformation; and (iii) for a linear transformation CPV is
equivalent to RIF.

Proof Due to the page limitations, we do not give the
proof here; it is given in [7].

Considering the results of Theorem 1, the optimal
weighting vector should maximize the above $\mathrm{SNR}_l$ while
satisfying the constraints $\mathbf{t}_l \cdot \mathbf{s}_k = 0$, for $k = 1, \ldots, m$, $k \neq l$.
The following Theorem gives the transformation vector
and proves its optimality.

Theorem 2 The solution to the problem

$$\mathrm{Max.}\ \left[\mathrm{SNR}_l = \frac{\mathbf{t}_l \cdot \mathbf{s}_l}{\sigma (\mathbf{t}_l \cdot \mathbf{t}_l)^{1/2}}\right] \qquad (7)$$

subject to the constraint that

$$\mathbf{t}_l \cdot \mathbf{s}_k = 0, \quad \text{for } k = 1, \ldots, m,\ k \neq l \qquad (8)$$

is given by

$$\mathbf{t}_l = \mathbf{s}_l - \mathbf{s}_l^{p} \qquad (9)$$

where $\mathbf{s}_l^{p}$ is the projection of $\mathbf{s}_l$ onto the subspace spanned
by $\{\mathbf{s}_k,\ k = 1, \ldots, m,\ k \neq l\}$. In addition, $\mathbf{t}_l$ can easily
be computed using a Gram-Schmidt orthogonalization
procedure.

Proof Due to the page limitations, we do not give the
proof here; it is given in [7], [8].

1)  Existence  of  the  Solution:  In  order  to  guarantee 
the  existence  of  the  transformation  vectors,  the  number 
of  unique  images  in  the  sequence  (n)  must  be  equal  to  or 
greater  than  the  number  of  signature  vectors  (m).  Here, 
a  unique  image  is  one  that  is  not  a  linear  combination  of 
other  images  in  the  sequence,  i.e.,  it  contains  information 
not present in other images.

III. RESULTS

Composite images generated by the optimal transformation
are similar to the eigenimages generated by solving
generalized eigenvalue problems [4]. Hence, this work may
be considered as a new approach to the derivation of the
eigenimage filter as a transformation, with several
advantages including:

1. Well-defined contrast criterion;

2. No need for the costly numerical solutions to the
generalized eigenvalue problems;

3. Straightforward analytical solution for the general case
of multiple interfering features;

4. Fast and numerically stable calculation of the weighting
vectors using the Gram-Schmidt orthogonalization;

5. Exact correction for partial volume averaging effects;

6. Explicit and simple expression for the SNR of the
eigenimage, for the case of known or well-estimated
signature vectors and additive zero-mean equi-power
white noise;

7. Suggesting the normalization of the original images
to the standard deviation of noise to yield the equi-power
white noise case for which the maximum SNR
is obtained by the eigenimage filter.

Examples

Applications of the optimal transformation to the MRI
simulation, phantom, and brain images were shown in the
presentation; we do not include them here due to the page
limitations. We showed that in all cases the desired feature
was segmented from the interfering features and partial
volume information was visualized in the resulting
transformed images. Volume calculations for agarose and egg
phantoms were also presented.

IV.  SUMMARY 

•M  optimal  transformation  for  simultaneous  CPV  and 
=^«inuzmg  SNR  was  derived.  No  linearity  assumption 
<a  imfially  made  for  the  transformation.  The  required 
^wpetties  for  the  transformation  were:  (i)  maximizing  the 
••'R  of  a  desired  feature  in  the  transformed  image;  and 
")  correctmg  for  partial  volume  averaging  effects.  It  was 
that  property  (ii)  requires  removal  of  the  interfer¬ 
es  features  from  the  transformed  image.  Using  this,  it 
'« shown  that  property  (u)  can  only  be  achieved  by  a 
transformation.  It  was  finally  proved  that  the  opti- 
^  transformation  matrix  is  easily  and  numerically  stably 
using  the  Gram-Schmidt  orthogonalization  pro- 


Each  column  of  the  optimal  transformation  matrix  is 
similar  to  a  weighting  vector  for  the  eigenimage  filter. 
Therefore,  it  may  be  considered  as  a  new  derivation  of 
the  eigenimage  filter  with  several  advantages.  We  believe 
that  the  results  of  the  new  derivation  sheds  new  light  on 
the  extended  applications  of  the  eigenimage  filter  to  vari¬ 
ous  clinical  or  industrial  problems.  It  also  establishes  the 
applicability  of  the  eigenimage  filter  for  fast  and  accurate 
in  vivo  volume  determinations. 
APPENDIX  R1

H.  Soltanian-Zadeh,  R.  Saigal,  J.P.  Windham,  A.E.  Yagle,  and  D.O.  Hearshen, 
“Optimization  of  MRI  Protocols  and  Pulse  Sequence  Parameters  for  Eigenim- 
age  Filtering,”  revision  submitted  to  IEEE  Trans.  Medical  Imaging,  July  1993. 

This  paper  proposes  a  procedure  for  optimizing  the  acquisition  of  MRI  scene  se¬ 
quences,  if  eigenimage  filtering  (see  Appendix  P)  is  then  used  to  process  the  MRI  scene 
sequence. 
Optimization of MRI Protocols and Pulse
Sequence Parameters for Eigenimage Filtering


The eigenimage filter generates a composite image
in which a desired feature is segmented from interfering features.
The signal-to-noise ratio (SNR) of the eigenimage equals its
contrast-to-noise ratio (CNR) and is directly proportional to
the dissimilarity between the desired and interfering features.
Since image gray levels are analytical functions of magnetic
resonance imaging (MRI) parameters, it is possible to maximize
this dissimilarity by optimizing these parameters. For optimization,
we consider four MRI pulse sequences: multiple spin-echo
(MSE); spin-echo (SE); inversion recovery (IR); and gradient-echo
(GE). We use the mathematical expressions for MRI signals
along with intrinsic tissue parameters to express the objective
function (normalized SNR of the eigenimage) in terms of MRI
parameters. The objective function along with a set of diagnostic
or instrumental constraints define a multidimensional nonlinear
constrained optimization problem, which we solve by the fixed
point approach. The optimization technique is demonstrated
through its application to phantom and brain images. We show
that the optimal pulse sequence parameters for a sequence of four
[...] to the conventional [...].

I. Introduction
A. Background and Motivation

THE EIGENIMAGE FILTER maximizes the projection of
a desired feature while minimizing the projections of the
undesired (interfering) features in a composite image called
an eigenimage (EI) [1]. It has been shown that the eigenimage
filter is the optimal linear transformation for magnetic
resonance imaging (MRI) scene segmentation that maximizes
signal-to-noise ratio (SNR) while correcting for partial volume
averaging effects [2], [3]. Since in the eigenimage the desired
feature appears bright, the interfering features appear dark,
and the partial volumes of the desired feature are visualized,
viewing the eigenimage facilitates interpretation of the MRI
scene. Fig. 1 illustrates the application of the eigenimage
filter for segmenting gray matter from white matter and
cerebrospinal fluid (CSF) in brain images.

Mathematically, the eigenimage is a weighted sum of the
images in the sequence. To explain the derivation of the
weighting vector, we use an n-dimensional Euclidean vector
space, where n is the number of images in the MRI
scene sequence. For example, when dealing with an MRI scene
sequence consisting of a T1-weighted and four T2-weighted
spin-echo images, we use a five-dimensional Euclidean vector
space. Then, an MRI scene sequence is represented by pixel
vectors. A pixel vector $\mathbf{P}_{jk} = [P_{jk1}\ P_{jk2} \cdots P_{jkn}]^T$ is a
vector whose elements are the corresponding gray levels of the
(j, k)-th pixels in the MR images (see Fig. 1). For example, for
256 x 256 images there are $2^{16}$ pixel vectors. The MRI
characteristics of tissue types are represented by signature
vectors. For image analysis, one is normally interested in
clearly visualizing one of the tissue types (referred to as
the desired feature), while the other tissue types (referred to
as the undesired or interfering features) interfere with its
visualization. A desired signature vector $\mathbf{d} = [d_1\ d_2 \cdots d_n]^T$ is
defined as a vector whose i-th element is the average gray level
of the desired feature in the i-th image. Undesired (interfering)
signature vectors $\mathbf{u}_i = [u_{i1}\ u_{i2} \cdots u_{in}]^T$, $1 \le i \le m'$ ($m'$ is
the number of interfering features), are similarly defined for
the interfering features (see Fig. 1).

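A pixel in the eigenimage ($EI_{jk}$) is a linear combination of
all pixels at the same location in the MR images, i.e.,

$$EI_{jk} = \sum_{i=1}^{n} e_i P_{jki} = \mathbf{e} \cdot \mathbf{P}_{jk} \tag{1}$$

where $EI_{jk}$ is the gray level of the (j, k)-th pixel in the
eigenimage, and $\mathbf{e} = [e_1\ e_2 \cdots e_n]^T$ is the weighting vector
to be determined.

To determine the weighting vector, the SNR of the desired
feature is maximized

$$\text{Max. SNR} = \frac{\mathbf{e} \cdot \mathbf{d}}{\sigma\,(\mathbf{e} \cdot \mathbf{e})^{1/2}} \tag{2}$$

subject to the constraint that

$$\mathbf{e} \cdot \mathbf{u}_i = 0, \quad \text{for } i = 1, \ldots, m'. \tag{3}$$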
Fig. 1. Illustration of the eigenimage filtering using four spin-echo T2-weighted images (TE/TR = 25-100/2^00 msec) and an inversion recovery image
(TE/TI/TR = 20/600/1,500 msec) of a human brain. Original images with the desired and undesired ROIs are shown to the left, the gray matter eigenimage
is shown to the right. Signature and weighting vectors are graphed at the bottom.


The solution to the above constrained optimization problem
is given by [2]

$$\mathbf{e} = \mathbf{d}_o = \mathbf{d} - \mathbf{d}_p \tag{4}$$

where $\mathbf{d}_p$ and $\mathbf{d}_o$ are the projection of $\mathbf{d}$ onto the subspace
spanned by $\{\mathbf{u}_i,\ i = 1, \ldots, m'\}$ and its orthogonal
complement, respectively. It can be shown that
$\mathbf{d}_p = U(U^T U)^{-1} U^T \mathbf{d}$, where $U$ is the $n \times m'$ matrix defined by the
undesired signature vectors, i.e., $U = [\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_{m'}]$. The
weighting vector $\mathbf{e}$ can be computed using the numerically
stable Gram-Schmidt orthogonalization. The solution is always
nonzero unless $\mathbf{d}$ is linearly dependent on $\{\mathbf{u}_i,\ i = 1, \ldots, m'\}$,
which is very unlikely to happen in practice.
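
The projection in (4) can be sketched numerically as follows. This is an illustrative sketch, not the authors' program: NumPy's QR factorization stands in for the Gram-Schmidt orthogonalization named in the text, the function name `eigenimage_weights` is hypothetical, and the signature vectors are made-up numbers chosen so the result can be checked by hand.

```python
import numpy as np

def eigenimage_weights(d, U):
    """Return e = d - d_p, the component of d orthogonal to the columns of U.

    d : (n,) desired signature vector
    U : (n, m') matrix whose columns are the interfering signature vectors
    """
    d = np.asarray(d, dtype=float)
    U = np.asarray(U, dtype=float)
    # Reduced QR gives an orthonormal basis of the undesired subspace
    # (used here in place of an explicit Gram-Schmidt loop).
    basis, _ = np.linalg.qr(U)
    d_p = basis @ (basis.T @ d)   # projection of d onto span{u_i}
    return d - d_p                # equation (4): e = d_o = d - d_p

# Hypothetical signature vectors for a three-image sequence.
d = np.array([3.0, 1.0, 2.0])
U = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
e = eigenimage_weights(d, U)
```

With these numbers the weighting vector keeps only the third image, the one in which neither interfering feature has any signal, and `U.T @ e` is zero as required by (3).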

As  explained  in  [2]  and  [3],  a  major  advantage  of  the 
eigenimage  filter,  as  compared  to  other  image  combination 
techniques,  is  its  uniqueness  in  correcting  for  partial  volume 
averaging effects. In the presence of multiple interfering
features, other filters (e.g., the maximum contrast filter) can
neither  zero  all  of  the  interfering  features  nor  correct  for  partial 
volume  averaging  effects. 

For the case of well-defined signature vectors, the SNR of
the desired feature in the eigenimage equals the contrast-to-noise
ratio (CNR) between the desired feature and any of the
interfering features and is expressed by [2], [3]

$$\text{SNR} = \text{CNR} = \frac{\|\mathbf{d}\|}{\sigma} \sin(\theta) \tag{5}$$

where $\mathbf{d}$ is the desired signature vector, $\sigma$ is the standard
deviation of the additive white noise, and $\theta$ is the angle
between the desired signature vector and its projection onto
the undesired subspace (subspace defined by the undesired
signature vectors). In the sequel, we will mention the SNR
only, since it equals the CNR for the eigenimage.

Equation (5) shows that the SNR of the eigenimage is
directly proportional to:

1) The reciprocal $1/\sigma$ of the noise strength (or SNR of the
original images);

2) The dissimilarity $\|\mathbf{d}\| \sin(\theta)$ between the desired feature
and the interfering features.

Therefore, there are two methods for improving the SNR
of the eigenimage:

1) Suppressing noise, i.e., improving the SNR of the
original images;

2) Optimizing the MRI protocols and pulse sequence
parameters, i.e., maximizing $\|\mathbf{d}\| \sin(\theta)$.
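
As a numerical check of (5), the sketch below computes the SNR two ways: from the angle between d and its projection, and directly from the weighting vector as e·d / (σ‖e‖). The second form is an assumption about how the SNR of a weighted sum is defined; the function name and the numbers are hypothetical.

```python
import numpy as np

def eigenimage_snr(d, U, sigma):
    """Return (SNR from the angle formula, SNR from the weighting vector)."""
    d = np.asarray(d, dtype=float)
    U = np.asarray(U, dtype=float)
    basis, _ = np.linalg.qr(U)            # orthonormal basis of the undesired subspace
    d_p = basis @ (basis.T @ d)           # projection of d onto that subspace
    d_o = d - d_p                         # orthogonal complement, used as e
    cos_theta = np.linalg.norm(d_p) / np.linalg.norm(d)
    sin_theta = np.sqrt(max(0.0, 1.0 - cos_theta ** 2))
    snr_angle = np.linalg.norm(d) * sin_theta / sigma
    # Assumed definition: signal e.d over noise std sigma*||e|| (undefined if d_o = 0).
    snr_direct = (d_o @ d) / (sigma * np.linalg.norm(d_o))
    return snr_angle, snr_direct

d = np.array([3.0, 1.0, 2.0])
U = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
snr_angle, snr_direct = eigenimage_snr(d, U, sigma=0.5)
```

Both values agree because ‖d_o‖ = ‖d‖ sin θ; halving σ or doubling ‖d‖ sin θ doubles the SNR, which is the proportionality the list above describes.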

Improving the SNR of the original images was the subject
of [4]. In this paper, we optimize MRI protocols and pulse
sequence parameters to maximize the dissimilarity between
the desired and interfering features.

8  ,>fPre,  „n,s  mi  Op„„r.a„.,„ 

Optimization of MRI protocols and pulse sequence parameters
has been an area of particular interest during the last
few years [5]-[18]. In deriving optimal MRI protocols and
pulse sequence parameters, several figures-of-merit (FOM)
[...]: (1) [...]; (2) [...]; (3) signal gradient [...];
(4) accuracy of the calculated tissue parameters [...]; (5) [...]
the synthesized image [16], [17]; and (6) diagnostic [...].
"Optimizing SNR" maximizes the SNR of a desired tissue
in the acquired image. "Optimizing CNR", usually referred
to as the contrast method, maximizes the difference between
[...]. [...] maximizes the sensitivity of the image gray levels
to subtle [...] with respect to the calculated values for N(H),
T1, and T2 [...] synthesized image. Finally, "optimizing the
[...]" [...]. Contrast, SNR, and parameter sensitivity are used
for [...].

Several methods have been utilized for solving the optimization
problems defined by the above FOMs [...]. [...] commonly,
SNR [5], CNR [9], noise level [17], [...] of parameter
sensitivity [...] are [...] typical [...]. [...] a computer program
implementing numerical methods [...] algorithm and a [...]
is used to find its solution [12]. [...] pulse sequences for
different FOMs (OFs) are different [...] pulse sequence which
optimizes the [...] the one that maximizes [...] CNR [17] [...]
specificity may require a [...].
C. Optimization Problem

[...] our FOM (OF) is not the SNR but the normalized SNR
(NSNR = SNR divided by the square root of the acquisition
time) of the composite image (eigenimage), generated by a
linear combination of [...] scene sequence [...]. [...] previously
defined for the accuracy of calculated [...] noise in the
synthesized image [...] inversion recovery (IR); and [...].

[...] element of a signature vector) [...] intrinsic tissue
parameters (N(H), T1, and T2) [...] in the MRI literature
such as [20], [21] or [...] estimated from the standard T1-
and T2-weighted [...] brain images. For the brain, N(H) and
T1 [...] MRI literature [...] the standard [...] brain images.
Note that N(H) [...].

The objective function, along with a set of [...]
instrumental constraints on [...] parameters, defines a
multidimensional nonlinear constrained optimization problem
[...] [22] programmed by Saigal [23].

Fixed-point theorems are routinely used to establish the
[...]
TABLE I

Intrinsic Tissue Parameters for the QC Phantom and the Normal Human Brain

               QC Phantom                    Human Brain
          A         B         C         WM        GM        CSF
N(H)    0.788     0.769     0.786     0.770     0.860     1.000
T1      162.0   1,340.0     300.0     515.0     871.0   1,900.0
T2       ?2.0     208.0     131.0      56.6      73.3     268

TABLE II

Typical Imaging Parameters for MRI Brain Studies

FOV      ST      MS           NEX    TNAI      MNPS
20 cm    5 mm    192 x 256    1      4 or 5    2

Pulse Sequence Name           Number of Images it Generates
Multiple Spin-Echo (MSE)
Spin Echo (SE)
Inversion-Recovery (IR)
Gradient-Echo (GE)

in  optimization).  Their  use  in  optimization  is  recent  (see  [24]), 
and  their  techniques  are  well  suited  for  problems  involving 
inequalities. In such a case, the methods of unconstrained
optimization fail, even when the underlying functions are
smooth. Generally, strategies like “active set” or “sequential
quadratic programming (SQP)” are adopted in optimization
methods to overcome this problem.

The  fixed-point  formulation  we  use  in  this  paper  takes  care 
of  the  upper  and  lower  bounds  on  the  variables  implicitly,  and 
these  constraints  are  easily  incorporated  into  the  formulation. 
This  reduces  the  size  of  the  problem  to  be  solved,  i.e.,  the 
number  of  unknowns  to  be  found.  It  uses  a  methodology  based 
on  triangulating  the  space  (as  in  finite  element  methods)  and 
following  a  path  of  solutions  to  a  minimum.  This  method  is 
ideally  suited  for  nondifferentiable  optimization,  and  thus  can 
handle  inequality  and  equality  constraints  without  explicitly 
considering  the  active  sets.  It  is  robust  in  passing  the  local 
minima  and  giving  the  global  minimum,  except  when  there 
exists  a  large  peak  between  the  local  and  global  minima. 
Its  rate  of  convergence  is  quadratic  and  is  thus  fast.  The 
optimization  technique  is  demonstrated  through  its  application 
to  the  QC  phantom  and  the  human  brain. 

In  Section  2,  we  describe  problem  formulation  and  the 
approach  to  find  a  solution.  In  Section  3,  we  present  theoretical 
and  experimental  results  for  the  QC  phantom  and  the  human 
brain.  In  Section  4,  we  summarize  the  research  performed 
and  give  conclusions,  and  in  the  Appendix,  we  explain  some 
mathematical  details  for  the  optimization  technique.  This  paper 
is  an  extension  of  the  poster  presented  at  the  IEEE  Medical 
Imaging  Conference  in  conjunction  with  the  Nuclear  Science 
Symposium  [25]. 


II. Methods
A. Problem Formulation

The goal of the optimization is to maximize the SNR of the
eigenimage given in (5). Averaging several acquired images
(or free induction decay (FID) signals) improves the SNR by
a factor of the square root of the number of averaged images
(or FID signals), since the additive noise fields are assumed to
be independent. Hence, we include the acquisition time (AT)
in the formulation of the optimization problem by defining the
following NSNR.

$$\text{NSNR} = \frac{\text{SNR}}{\sqrt{AT}} = \frac{\|\mathbf{d}\|}{\sigma \sqrt{AT}} \sin(\theta) \tag{6}$$

Depending on the application, desired and interfering features
are defined, and a single or multiple NSNRs are considered.
When using the eigenimage filter to segment an abnormality
from normal tissues, the NSNR of the eigenimage, generated
by taking the abnormality as the desired feature and normal
tissues as the interfering features, is the objective function.
When using the eigenimage filter in 3-D visualization, the
objective function is the minimum of the NSNRs of several
eigenimages, each generated by taking one of the tissues as
the desired feature and the other tissues as interfering features.

As explained earlier, we solve the optimization problem
by translating it to a fixed point problem. This translation
requires the gradient of the objective function, as we will
explain in Section 2.2. Calculation of the gradient will be much
easier if we manipulate and analytically simplify the objective
function. For this reason, we use matrix notation and define
the following objective function $f$.^1

$$f = -(\sigma\,\text{NSNR})^2 = -\frac{\|\mathbf{d}\|^2}{AT} \sin^2(\theta) = -\frac{\|\mathbf{d}_o\|^2}{AT} = -\frac{\mathbf{d}^T \mathbf{d}_o}{AT} \tag{7}$$

where

$$\mathbf{d}_o = \mathbf{d} - \mathbf{d}_p = [I - U(U^T U)^{-1} U^T]\,\mathbf{d} \tag{8}$$

$$U = [\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_{m'}] =
\begin{bmatrix}
U_{11} & U_{21} & \cdots & U_{m'1} \\
U_{12} & U_{22} & \cdots & U_{m'2} \\
\vdots & \vdots &        & \vdots  \\
U_{1n} & U_{2n} & \cdots & U_{m'n}
\end{bmatrix} \tag{9}$$

^1 Minimization of $f$ is equivalent to maximization of NSNR, for a given
noise level $\sigma$. Without loss of generality, we may also consider $\sigma = 1$, since
in deriving MRI signal models, identical constant transmitter and receiver
gains are assumed for all of the pulse sequences.
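
The quantities in (6)-(8), and the max-min objective described for 3-D visualization, can be sketched as below. The function names and the signature matrix `S` are hypothetical, the acquisition time is treated as a given number rather than computed from a protocol, and QR again stands in for Gram-Schmidt.

```python
import numpy as np

def orthogonal_part(d, U):
    """d_o = [I - U (U^T U)^{-1} U^T] d, computed through a QR basis of U."""
    basis, _ = np.linalg.qr(U)
    return d - basis @ (basis.T @ d)

def objective_f(d, U, acq_time):
    """f = -d^T d_o / AT, as in (7), with sigma taken as 1."""
    return -(d @ orthogonal_part(d, U)) / acq_time

def nsnr(d, U, acq_time, sigma=1.0):
    """NSNR = ||d|| sin(theta) / (sigma sqrt(AT)), recovered from f."""
    return np.sqrt(max(0.0, -objective_f(d, U, acq_time))) / sigma

def min_nsnr(S, acq_time, sigma=1.0):
    """Smallest NSNR when each column of S in turn is the desired feature."""
    values = []
    for i in range(S.shape[1]):
        d = S[:, i]
        U = np.delete(S, i, axis=1)   # remaining tissues act as interfering features
        values.append(nsnr(d, U, acq_time, sigma))
    return min(values)

# Hypothetical, mutually orthogonal signature vectors (one per column).
S = np.diag([2.0, 3.0, 4.0])
f_first = objective_f(S[:, 0], S[:, 1:], acq_time=4.0)
worst = min_nsnr(S, acq_time=4.0)
```

For a single desired feature one would minimize `objective_f`; for 3-D visualization one would maximize `min_nsnr` over the pulse sequence parameters that determine `S` and the acquisition time.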
TABLE III

Optimum Pulse Sequence Parameters for Segmentation of One Feature, and NSs of the
Resultant Eigenvalues for Four Combinations of MRI Scene Sequences of the QC Phantom

TE\t-F  TR  ,.r  - - - - - ^ ^  n.A.M'jM 

-"'-*■0-“'  546.18  I _ 

44.06  :. 430.60 

_ _ ~-T64  878.68 


41.40 

20.00 

37.58 

732.6 

490.9 

786.2 

10.0 

10.0 

10.0 

1.000 

2.498 

2.000 

1<E 

30,52 

19.00 

TR-,.r 

399.4 

2.500.0 

TE,r 

12.0 

20,0 

TFjr 
182.2 
879  7 

23.52 

ISE 

39.17 

20.00 

307.9 

TR 

607.0 

480.0' 

12.0 

.  TEcf. 

5.0 

5.0 

154.0 

51.09 

58.32 

38.79 

607.0 

5.0 

30.71 

TR.'r 
500.0  ' 
2.500.0  ' 
500.0  ' 
T  Rr:  r 

327,5  ' 
2.000,0  ~ 
383.2  ' 


227.00  ' 
41.67  ' 
132.30  ' 
-V5  4  ' 

150.20 
64.05  ~ 
122,00 


35 

143, J) 
"8.13 
Sr 

21.93  ' 
1 15,20  ' 
42.45  ' 

40.23 
153.80  ~ 
_ ^.10  ~ 


The $n \times 1$ vectors $\mathbf{d}_p$ and $\mathbf{d}_o$ are the projection of $\mathbf{d}$ onto
the undesired subspace (the subspace defined by the undesired
signature vectors, i.e., the columns of $U$) and its orthogonal
complement, respectively, $I$ is the $n \times n$ identity matrix,
and $\{\mathbf{u}_i,\ 1 \le i \le m'\}$ are the interfering signature vectors.
Using the relationship between the MRI signal from a tissue,
its intrinsic parameters (N(H), T1, and T2), and MRI protocols
and pulse sequence parameters (TE, TI, TR, and flip angle),
$f$ in (7) may be expressed as a function of the pulse sequence
parameters, i.e., the objective function is $f(\mathbf{x})$ where $\mathbf{x}$
contains the pulse sequence parameters. Here, we suffice with
the above explanation of the objective function. Since each
element of $\mathbf{d}$ and $U$ is a complex function of the
above-mentioned parameters (see (45)-(46) for [...]), the [...]
representation of $f(\mathbf{x})$ does not fit into [...].

Constraints: There are two sets of MRI parameters: (1)
imaging parameters (matrix size, field of view (FOV), and
slice thickness); and (2) pulse sequence parameters (TE, TI,
TR, and flip angle). The optimization procedure should include
a number of constraints on these parameters. Some of these
constraints are specified by the diagnostician and are task
dependent, e.g., the minimum required resolution, the total
volume to be covered in the examination, and the maximum
imaging time. Others are instrumental limitations, e.g., in a
[...] certain positive number [...] imaging equipment. TE should
also [...]. To satisfy these constraints: (1) we fix some of the
parameters [...]; and (2) for the other parameters, we consider
lower and upper bounds ($lb_i \le x_i \le ub_i$, $i = 1, \ldots, p$, where
$x_i$ is the $i$-th parameter and $lb_i$ and $ub_i$ are lower and upper
bounds, respectively), and linear constraints ($g_j(\mathbf{x}) \le 0$,
$j = 1, \ldots, m$, where $m$ is the number of required relationships
among the parameters). The optimization problem can then be
formulated as

$$\begin{aligned}
&\text{Minimize } f(\mathbf{x}) \\
&\text{subject to } g_j(\mathbf{x}) \le 0, \quad j = 1, \ldots, m \\
&\phantom{\text{subject to }} lb_i \le x_i \le ub_i, \quad i = 1, \ldots, p
\end{aligned} \tag{10}$$

where $f$ is previously defined in (7)-(9), and
$\mathbf{x}$ = [TE, TI, TR, flip-angle], for example.

[...] parameters listed in Table II, an image sequence may
consist of: (1) four [...]; (2) [...] one SE image; (3) [...] and
one IR image.^3 For each combination, a constrained
optimization problem as defined in (10) is solved (using the
fixed point approach discussed next) to find the corresponding
optimum MRI pulse sequence parameters. Each minimum
corresponds to a candidate set of parameters for the final
optimal MRI protocol. A comparison of the resulting NSNR
(when using these candidate protocols) gives the protocols
and pulse sequence parameters that yield the smallest (global)
minimum of $f$ (largest (global) maximum of NSNR of the
eigenimage).

B. Translation of the New Optimization
Problem into a Fixed-Point Problem

A fixed-point problem is defined as follows. Given a
point-to-set mapping $\Gamma$ from $R^n$ (the n-dimensional Euclidean
vector space) into nonempty subsets of $R^n$, find a point $\mathbf{x}$
such that $\mathbf{x} \in \Gamma(\mathbf{x})$, i.e., $\mathbf{x}$ is a fixed point of $\Gamma$ [22], [23]. In
particular, if $\Gamma$ maps points to points, we have $\mathbf{x} = \Gamma(\mathbf{x})$.
In the following, we show how the constrained optimization
problem in (10) is translated into a fixed-point problem.

Claim: Define (here we define row vectors)

$$x_i^+ = \begin{cases}
lb_i & \text{if } x_i < lb_i \\
x_i  & \text{if } lb_i \le x_i \le ub_i \\
ub_i & \text{if } x_i > ub_i
\end{cases} \tag{11}$$

^3 The number of images in the sequence is limited by the total
acquisition [...] conventional pulse sequences [...].
TABLE IV

Optimum Pulse Sequence Parameters, for Segmentation of all Three Features, and NSs of the
Resultant Eigenvalues for Three Combinations of MRI Scene Sequences of the QC Phantom


PS 

DF 

TE\,-f 

TR„.r 

TR.f 

-V5a 

XSb 

4 

A 

3 1 .45 

10.0 

1.500 

79.09 

98.04 

MSE 

B 

10.0 

1.498 

72.23 

109.70 

68.29 

!  SE 

C 

35.16 

10.0 

1.500 

78.54 

93.48 

81  P 

PS 

DF 

TEm.f 

TR-:.r 

TI;r 

TR,r 

VS , 

\Sb 

•VSr^  ■ 

4 

,A 

32.94 

500.0 

815.3 

2.000 

61.76 

60.74 

53.16 

MSE 

B 

396. 1 

635.5 

2.000 

58.59 

90.12 

51.91 

I  IR 

C 

22.13 

776.7 

2.122 

60.63 

72.99 

54  41 

PS 

DF 

TEm.f 

TR,,<f 

TEr.F 

f>GE 

TRcf 

.VS  4 

VSh 

•VSr 

4 

A 

31.17 

500 

5.0 

115.70 

2.000 

70.36 

93. .36 

71.94 

MSE 

B 

500 

5.0 

29.36 

500 

102.50 

133.60 

96.85 

1  IR 

C 

35.33 

500 

5.0 

39.38 

500 

112.30 

114.2(1 

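$$\lambda_j^+ = \max\{0, \lambda_j\} \tag{12}$$

$$\lambda_j^- = \min\{0, \lambda_j\} \tag{13}$$

$$\mathbf{x} = [x_1, x_2, \ldots, x_p] \tag{14}$$

$$\mathbf{x}^+ = [x_1^+, x_2^+, \ldots, x_p^+] \tag{15}$$

$$\boldsymbol{\lambda} = [\lambda_1, \lambda_2, \ldots, \lambda_m] \tag{16}$$

$$\boldsymbol{\lambda}^+ = [\lambda_1^+, \lambda_2^+, \ldots, \lambda_m^+] \tag{17}$$

$$\mathbf{g}(\cdot) = [g_1(\cdot), g_2(\cdot), \ldots, g_m(\cdot)] \tag{18}$$

$$\nabla f(\cdot) = \left[ \frac{\partial f}{\partial x_1}(\cdot),\ \frac{\partial f}{\partial x_2}(\cdot),\ \ldots,\ \frac{\partial f}{\partial x_p}(\cdot) \right] \tag{19}$$

$$\nabla \mathbf{g}(\cdot) =
\begin{bmatrix}
\frac{\partial g_1}{\partial x_1}(\cdot) & \frac{\partial g_1}{\partial x_2}(\cdot) & \cdots & \frac{\partial g_1}{\partial x_p}(\cdot) \\
\frac{\partial g_2}{\partial x_1}(\cdot) & \frac{\partial g_2}{\partial x_2}(\cdot) & \cdots & \frac{\partial g_2}{\partial x_p}(\cdot) \\
\vdots & \vdots & & \vdots \\
\frac{\partial g_m}{\partial x_1}(\cdot) & \frac{\partial g_m}{\partial x_2}(\cdot) & \cdots & \frac{\partial g_m}{\partial x_p}(\cdot)
\end{bmatrix} \tag{20}$$

then solving the following fixed-point problem

$$\begin{cases}
\mathbf{x}^+ - [\nabla f(\mathbf{x}^+) + \boldsymbol{\lambda}^+ \nabla \mathbf{g}(\mathbf{x}^+)] = \mathbf{x} \\
\boldsymbol{\lambda}^+ + \mathbf{g}(\mathbf{x}^+) = \boldsymbol{\lambda}
\end{cases} \tag{21}$$

yields a solution to (10).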
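
The mapping whose fixed point is sought in (21) can be written out as a short function. This is only a sketch of the map itself, assuming row vectors stored as 1-D NumPy arrays; it is not the fixed-point program of [23], and the two-variable test problem (quadratic objective, one linear constraint) is a hypothetical example chosen so that the KKT point is known in closed form.

```python
import numpy as np

def fixed_point_map(x, lam, grad_f, g, jac_g, lb, ub):
    """Left-hand sides of (21); (x, lam) solves (21) when it is returned unchanged."""
    x_plus = np.clip(x, lb, ub)           # (11): clamp x to its bounds
    lam_plus = np.maximum(lam, 0.0)       # (12): nonnegative part of lambda
    x_new = x_plus - (grad_f(x_plus) + lam_plus @ jac_g(x_plus))
    lam_new = lam_plus + g(x_plus)
    return x_new, lam_new

# Hypothetical problem: minimize (x1-2)^2 + (x2-2)^2
# subject to x1 + x2 - 2 <= 0 and 0 <= x_i <= 3.
grad_f = lambda x: 2.0 * (x - 2.0)
g = lambda x: np.array([x[0] + x[1] - 2.0])
jac_g = lambda x: np.array([[1.0, 1.0]])      # m x p Jacobian, as in (20)
lb, ub = np.zeros(2), np.full(2, 3.0)

# Known KKT point x* = (1, 1) with multiplier lambda* = 2.
x_new, lam_new = fixed_point_map(np.array([1.0, 1.0]), np.array([2.0]),
                                 grad_f, g, jac_g, lb, ub)

# The unconstrained minimizer (2, 2) violates the constraint and is not a fixed point.
x_bad, lam_bad = fixed_point_map(np.array([2.0, 2.0]), np.array([0.0]),
                                 grad_f, g, jac_g, lb, ub)
```

The KKT point is mapped to itself, while at the infeasible point the multiplier equation changes λ, signalling that (21) does not hold there.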

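Proof: We show that a solution to (21) is a Karush-Kuhn-Tucker
(KKT) point [26] (Section 6.1 in the Appendix shows
that for the case of convex $f$ and $g_j$'s, the resulting solution
yields the minimum of $f(\mathbf{x})$). Rewrite (10) as

$$\begin{aligned}
&\text{Minimize } f(\mathbf{x}) \\
&\text{subject to } g_j(\mathbf{x}) \le 0, \quad j = 1, \ldots, m \\
&\phantom{\text{subject to }} x_i \le ub_i, \quad i = 1, \ldots, p \\
&\phantom{\text{subject to }} -x_i \le -lb_i, \quad i = 1, \ldots, p
\end{aligned} \tag{22}$$

and define the Lagrange multipliers $\boldsymbol{\lambda} = [\lambda_1, \ldots, \lambda_m]$,
$\boldsymbol{\psi} = [\psi_1, \ldots, \psi_p]$, and $\boldsymbol{\phi} = [\phi_1, \ldots, \phi_p]$ for the
inequalities in (22), respectively. Then a KKT point should
satisfy the following conditions [26].

$$\lambda_j, \psi_i, \phi_i \ge 0, \quad j = 1, \ldots, m, \ i = 1, \ldots, p \tag{23}$$

$$g_j(\mathbf{x}) \le 0, \quad j = 1, \ldots, m \tag{24}$$

$$lb_i \le x_i \le ub_i, \quad i = 1, \ldots, p \tag{25}$$

$$\boldsymbol{\lambda}\, \mathbf{g}(\mathbf{x})^T = 0 \tag{26}$$

$$\boldsymbol{\psi} (\mathbf{x} - \mathbf{ub})^T = 0, \quad \mathbf{ub} = [ub_1, \ldots, ub_p] \tag{27}$$

$$\boldsymbol{\phi} (-\mathbf{x} + \mathbf{lb})^T = 0, \quad \mathbf{lb} = [lb_1, \ldots, lb_p] \tag{28}$$

$$\nabla f(\mathbf{x}) + \boldsymbol{\lambda} \nabla \mathbf{g}(\mathbf{x}) + \boldsymbol{\psi} I - \boldsymbol{\phi} I = 0 \tag{29}$$

where $I$ is the $p \times p$ identity matrix. Assume $(\mathbf{x}, \boldsymbol{\lambda})$ is a
solution to the fixed-point problem in (21). From (11), (12),
(15), and (17) it follows that $\mathbf{x}^+$ satisfies (25) and $\boldsymbol{\lambda}^+$ satisfies
(23). Since $\mathbf{g}(\mathbf{x}^+) = \boldsymbol{\lambda} - \boldsymbol{\lambda}^+ = \boldsymbol{\lambda}^-$, from (13) it is clear that
(24) is satisfied, also $\boldsymbol{\lambda}^+ \mathbf{g}(\mathbf{x}^+)^T = \boldsymbol{\lambda}^+ (\boldsymbol{\lambda}^-)^T = 0$, i.e., (26) is
satisfied. From (21), $\nabla f(\mathbf{x}^+) + \boldsymbol{\lambda}^+ \nabla \mathbf{g}(\mathbf{x}^+) = \mathbf{x}^+ - \mathbf{x} = -\mathbf{x}^-$,
and for each $x_i$ at least one of the following three cases should
hold: (1) $x_i^+ = x_i$, i.e., $lb_i \le x_i \le ub_i$; (2) $x_i^+ = lb_i$, i.e.,
$x_i < lb_i$; or (3) $x_i^+ = ub_i$, i.e., $x_i > ub_i$. For (1) $x_i^- = 0$ and
$\psi_i = \phi_i = 0$; for (2) $-x_i^- > 0$, $\psi_i = 0$, and $\phi_i = -x_i^- > 0$;
and for (3) $-x_i^- < 0$, $\psi_i = x_i^- > 0$, and $\phi_i = 0$. Therefore,
(23), (27), (28), and (29) are also satisfied. Q.E.D.

Simple Example: In this section, we give a simple example
to illustrate the method of translating an optimization problem
into a fixed point problem.

Consider the following single variable example.

$$\begin{aligned}
&\text{Minimize } f(x) = x^2 - 3x \\
&\text{subject to } 2 \le x \le 3
\end{aligned} \tag{30}$$

According to (11) we define $x^+$.

$$x^+ = \begin{cases}
2 & \text{if } x < 2 \\
x & \text{if } 2 \le x \le 3 \\
3 & \text{if } x > 3
\end{cases} \tag{31}$$

Since there is no $g_j(x)$, there is no need for $\lambda_j$. The
corresponding fixed-point problem according to (21) will be

$$x^+ - \nabla f(x^+) = x \tag{32}$$

But $\nabla f(x) = 2x - 3$ and using (31) we find that

$$x^+ - \nabla f(x^+) = \begin{cases}
1 & \text{if } x < 2 \\
3 - x & \text{if } 2 \le x \le 3 \\
0 & \text{if } x > 3
\end{cases} \tag{33}$$

which equals $x$ if and only if $x = 1$. Thus the solution to the
fixed-point problem, i.e., the equality in (32), is $x = 1$, for
which $x^+ = 2$ by (31). Note that the solution to the optimization
problem is $x^+ = 2$, not $x$. One can show that this solution is
correct by plotting $f(x) = x^2 - 3x$ for $2 \le x \le 3$.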
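
The single-variable example can be checked mechanically. The sketch below evaluates the mapping in (32) on a grid, compares it with the piecewise form (33), and searches the grid for fixed points; the grid and the helper names are illustrative choices, not part of the original method.

```python
def x_plus(x, lb=2.0, ub=3.0):
    """Clamp x to [lb, ub], as in (31)."""
    return min(max(x, lb), ub)

def grad_f(x):
    """Derivative of f(x) = x^2 - 3x."""
    return 2.0 * x - 3.0

def mapping(x):
    """Left-hand side of (32): x+ minus the gradient evaluated at x+."""
    xp = x_plus(x)
    return xp - grad_f(xp)

def piecewise(x):
    """Closed form (33)."""
    if x < 2.0:
        return 1.0
    if x <= 3.0:
        return 3.0 - x
    return 0.0

grid = [-5.0 + 0.25 * i for i in range(53)]          # -5.0, -4.75, ..., 8.0
fixed_points = [x for x in grid if mapping(x) == x]  # exact for quarter steps
x_opt = x_plus(fixed_points[0])                      # constrained minimizer

# Brute-force minimum of f over [2, 3] for comparison.
feasible = [2.0 + 0.01 * i for i in range(101)]
x_brute = min(feasible, key=lambda x: x * x - 3.0 * x)
```

The only fixed point on the grid is x = 1, and clamping it gives x+ = 2, which matches the brute-force minimizer of f on [2, 3].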

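Calculation of Gradients: Definition of the fixed-point
problem in (21) requires the gradients of the objective function
and the constraints. The constraints are simple linear functions
and their gradients are easily calculated. The objective function
is nonlinear and complex, and calculation of its gradient,
although straightforward, is somewhat tedious. To proceed
with this gradient calculation, define

$$Q = I - U(U^T U)^{-1} U^T \tag{34}$$

then

$$f = -\frac{\mathbf{d}^T Q^T Q\, \mathbf{d}}{AT} \tag{35}$$

Equation (35) shows that $f$ is a function of $Q$, $\mathbf{d}$, and AT, each
of which in turn is a function of the pulse sequence parameters
$\mathbf{x}$. Hence, using the chain rule we have

$$\nabla_x f = \nabla_Q f\, \nabla_U Q\, \nabla_x U + \nabla_d f\, \nabla_x \mathbf{d} + \nabla_{AT} f\, \nabla_x AT \tag{36}$$

where subscripts are variables with respect to which gradients
are calculated. Some of these gradients, e.g., $\nabla_x U$ and $\nabla_x \mathbf{d}$,
are found analytically; others, e.g., $\nabla_Q f$ and $\nabla_U Q$, are found
by the perturbation method, as explained in Section 6.2 in the
Appendix. The results are summarized below.

$$\nabla_Q f = -\frac{2}{AT} \left[ \mathrm{diag}(Q\mathbf{d})\, \mathbf{c}\, \mathbf{d}^T \right] \tag{37}$$

$$\mathbf{c} = [1, 1, \ldots, 1]^T \tag{38}$$

$$\nabla_U Q\, \Delta U = -\Delta U (U^T U)^{-1} U^T - U (U^T U)^{-1} \Delta U^T
+ U (U^T U)^{-1} [\Delta U^T U + U^T \Delta U] (U^T U)^{-1} U^T \tag{39}$$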

$$\nabla_x U = \left[ \frac{\partial U_{ij}}{\partial x_k} \right] \tag{40}$$

$$\nabla_d f = -\frac{2}{AT}\, \mathbf{d}^T Q^T Q \tag{41}$$

$$\nabla_x \mathbf{d} =
\begin{bmatrix}
\frac{\partial d_1}{\partial x_1} & \frac{\partial d_1}{\partial x_2} & \cdots & \frac{\partial d_1}{\partial x_p} \\
\vdots & \vdots & & \vdots \\
\frac{\partial d_n}{\partial x_1} & \frac{\partial d_n}{\partial x_2} & \cdots & \frac{\partial d_n}{\partial x_p}
\end{bmatrix} \tag{42}$$

$$\nabla_{AT} f = \frac{\mathbf{d}^T Q^T Q\, \mathbf{d}}{AT^2} = -\frac{1}{AT} f \tag{43}$$

$$\nabla_x AT = [0, \ldots, 0, 1, 0, \ldots, 0, 1] \tag{44}$$

[...] corresponds to the repetition time (TR) [...] participating
pulse sequences. Again, due to the complexity of the functions
we suffice with the above implicit presentation of the
gradients. Further details for one of the [...].


III. Results

A. Optimal Pulse Sequence Parameters

The optimization procedure described in the previous section
was applied to MRI scene sequences of the QC phantom and
the normal human brain. Intrinsic tissue parameters for the
QC phantom were estimated using the standard T1- and
T2-weighted sequences. For the normal brain tissues, N(H) and
T1 values were obtained from the MRI literature [20], [21],
and T2 values were estimated from the standard T2-weighted
brain images. The tissue parameters for the QC phantom and
normal human brain are listed in Table I.

Four combinations of the basic MRI pulse protocols were
considered for the optimization. They were: (1) a MSE with
four echoes; (2) a MSE with four echoes and a SE; (3) a MSE
[...] a GE. Except for the first case, the objective function
[...] executed several times using different starting points and
[...] of pulse [...]. Each time a local minimum of the objective
function (local maximum of the NSNR of the eigenimage) was
found. Local minima were compared to obtain the smallest
(global) minimum of the objective function.

Example: As an example, we explain details of the
optimization procedure for combination (3). First, the partial
derivatives of the signal models, with respect to the pulse
sequence parameters, were analytically calculated. Second, a
computer program numerically calculated (34)-(44) for each
set of tissue [...] pulse sequence [...]. [...] used the following
signal models and their partial derivatives.

[Equations (45)-(51): signal models $S^i_{MSE}$ ($1 \le i \le 4$) and $S_{IR}$
and their partial derivatives with respect to the pulse sequence
parameters; not recoverable from the scan.]

where $S^i_{MSE}$ represents the signal in the ith image of a
four multiple spin-echo sequence with parameters $TE_{MSE}$ and
$TR_{MSE}$, and $S_{IR}$ represents the signal in an inversion recovery
image with parameters $TE_{IR}$, $TI_{IR}$, and $TR_{IR}$.

Third, a second computer program used outputs of the first
program to implement (11)-(20), with:

$$\mathbf{lb} = [lb_1, lb_2, lb_3, lb_4, lb_5] = [19,\ 500k,\ 12,\ 100,\ 500l], \quad k, l = 0, \ldots, 4 \tag{52}$$

$$\mathbf{ub} = [ub_1, ub_2, ub_3, ub_4, ub_5] = [100,\ 500(k+1),\ 20,\ 500(l+0.9),\ 500(l+1)], \quad k, l = 0, \ldots, 4 \tag{53}$$

$$\mathbf{x} = [x_1, x_2, x_3, x_4, x_5] = [TE_{MSE}, TR_{MSE}, TE_{IR}, TI_{IR}, TR_{IR}] \tag{54}$$

$$g_1(\mathbf{x}) = 4x_1 - x_2 + 50 \tag{55}$$

$$g_2(\mathbf{x}) = x_3 + x_4 - x_5 + 50 \tag{56}$$

where the elements of $\mathbf{lb}$, $\mathbf{ub}$, and $\mathbf{x}$ are in milliseconds
(msec), and the constraints $g_1(\mathbf{x}) \le 0$ and $g_2(\mathbf{x}) \le 0$ are similar
to those explained in Section 2.1.1. Equation (55) states that
$TR_{MSE}$ should be greater than or equal to four times $TE_{MSE}$,
which is the acquisition time for the fourth MSE image, plus
50 msec, which is needed for the sample excitation, signal
acquisition, and software management. Equation (56) states
that $TR_{IR}$ should be greater than or equal to $TE_{IR} + TI_{IR} + 50$
msec. Again about 50 msec is needed for the sample excitation,
signal acquisition, and software management.

Finally, the main subroutine of the fixed point program [23]
was modified so that it used the outputs of the above programs
and found a solution to (21). For each of the QC phantom
and the human brain, and for each of the desired features,
the fixed point program was run 25 times ($k, l = 0, \ldots, 4$).
[...] optimal point. A comparison of
the resulting values of the objective function at these locally
optimum points yielded the globally optimal parameters for
the corresponding pulse sequences.

Fig. 2. Original and eigenimages of the QC phantom, using the optimal MRI
pulse sequence parameters for a sequence of 4 MSE and 1 SE images. a-c: 4
MSE (TE/TR = 31-124/500 msec) and 1 SE (TE/TR = 10/1,300 msec) images.
f-h: Resulting eigenimages for vials A, B, and C, respectively. Orientation
of vials A, B, and C in Figs. 2-4 are: A is the circular cross section at 10:00
o'clock position, B is the lengthwise cross sections at 12:00 and 9:00 o'clock
positions, and C is the lengthwise cross sections at 3:00.

It should be noted that practical optimization procedures
require the functions to be convex to guarantee that the
solution found is correct. When this assumption does not
hold, one is generally guaranteed only a KKT point, which
may even be a saddle point. This is no different for our
formulation, and its convergence to a solution can be proved
only for the convex case. However, since this formulation can
treat the bounds very efficiently, we were able to effectively
search for the global minimizer by partitioning the space
into sufficiently small “cubical” regions, searching for a
minimizer in each region, and then choosing the best solution
found. Although there is no guarantee that the final solution
will be the global minimizer, our computational experience
appears to support the sufficiency of its accuracy for all
practical purposes.
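The partition-and-compare strategy can be sketched on a toy problem. The example below is an illustration only: it uses a hypothetical one-dimensional nonconvex objective and plain projected gradient descent as the local solver, not the fixed-point program used in this work.

```python
def f(x):
    """Toy nonconvex objective with two local minima (illustrative only)."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def grad_f(x):
    """Analytic derivative of the toy objective."""
    return 4.0 * x * (x * x - 1.0) + 0.3


def local_search(lo, hi, step=0.01, iters=2000):
    """Projected gradient descent restricted to the interval [lo, hi]."""
    x = 0.5 * (lo + hi)
    for _ in range(iters):
        x = min(hi, max(lo, x - step * grad_f(x)))  # step, then clip to the box
    return x


def partitioned_search(lo, hi, n_regions):
    """Run one local search per sub-interval and keep the best candidate."""
    width = (hi - lo) / n_regions
    candidates = []
    for i in range(n_regions):
        a = lo + i * width
        x = local_search(a, a + width)
        candidates.append((f(x), x))
    return min(candidates)


best_f, best_x = partitioned_search(-2.0, 2.0, 4)
print(best_x, best_f)
```

A single local search started in [0, 1] would stop at the poorer minimum near x = 0.96; comparing the per-region results recovers the better one near x = -1.04.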

Tables III and V list the optimum pulse sequence parameters
for segmenting one of the features using each of the
combinations (1)-(4), for the QC phantom and the human
brain, respectively. They also list the σNSNR of the
eigenimages generated for each feature (tissue). Tables IV and VI list
several optimum pulse sequence parameters for segmenting all
of the three features in the scene. These parameters maximize
the minimum NSNR of the three eigenimages. They are
appropriate for the application of the eigenimage filter to
3-D visualization, since for this application all of the three
eigenimages are used and thus the minimum of the three
NSNRs determines the quality of the resulting 3-D image.
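The max-min criterion can be illustrated with a short Python sketch. It only ranks the four pulse sequence combinations by their smallest mathematical NSNR, using the values reported in Table VII; it does not reproduce the parameter optimization itself, and the dictionary keys are labels chosen here for readability.

```python
# Mathematical NSNRs (vials A, B, C) for the four combinations in Table VII.
nsnr = {
    "a: 4 MSE + 1 SE": (9.89, 12.26, 10.06),
    "b: 4 MSE + 1 IR": (7.58, 9.12, 6.80),
    "c: 4 MSE + 1 GE": (14.04, 13.89, 14.28),
    "d: conventional": (5.35, 7.72, 3.96),
}


def worst_case(values):
    """Quality of a combination is set by its weakest eigenimage."""
    return min(values)


# Pick the combination whose smallest NSNR is largest.
best = max(nsnr, key=lambda name: worst_case(nsnr[name]))
print(best, worst_case(nsnr[best]))
```

With these numbers the criterion selects combination (c), whose weakest eigenimage has an NSNR of 13.89.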


table  V 


PS 

4 

MSE 

PS 

4 

DF 

W 

G 

C 

DF 

TExi.g. 

43.88 

51.25 

84.37 

TEii.r 

X  TR~j^p 

986. 6C 
1.507.50 
3.085.69 
TR' 

TE,r 

->NS  OF  .MRI  SCE 

TE<r 

500 

500 

1.139 

TIir 

631.2 

555.3 
360.8 

N'E  SeOL  ENCES  0 

F  THE  Hl.MA.S  B 

9.68 

9.00 

4.43 

RAIN 

I  -V^B 

I  8.31 

8.90 
5.92 

piSEi; 

49.98 

68.23 

96.28 

MSE 

1  SE 

PS 

G 

C 

DF 

32.56 

52.55 

74.01 

TEm.p 

2.332 

2.499 

2.499 

TR\f<;fr 

10.0 

10.0 

10.0 

TFf  r, 

.V54 

46.33 

46.23 

30.19 

-V5fl 

36.62 

36.74 

30.43 

■VSr- 

161.70 

164.40 

MSE 

1  IR 

PS 

W 

G 

C 

DF 

31.34 

19.00 

57.64 

TFs  r.-r- 

1.000 

1.059 

2.500 

12.0 

12.0 

12.0 

TRir 

2.336 

2,500 

1.000 

_  .V5, 

107.70 

79.36 

■VSr 

64.48 

105.70 

■VSr- 

95.86 

95.70 

4 

MSE 

1  GE 

W 

G 

C 

54.84 

54.60 

73.03 

TR.v/sr 

2,207 

2,424 

2J00( 

TEr.r 

5.0 

5.0 

5.0 

104.64 

104.86 

60.98 

TRnr 

456.1 

399.6 

82.0 

.V54 

48.05 

47.37 

33.35 

54.77 

— :^'^g  .. 

37.97 
38.38 
32.21 

198.60 

A'5r- 

165.30 

171.20 

183.70 

The following notation is used in Tables III-VI:

PS            : pulse sequence
DF            : desired feature
TEmse         : echo time (TE), in msec, for the MSE pulse sequence
TRmse         : repetition time (TR), in msec, for the MSE pulse sequence
TEse          : echo time (TE), in msec, for the SE pulse sequence
TRse          : repetition time (TR), in msec, for the SE pulse sequence
TEir          : echo time (TE), in msec, for the IR pulse sequence
TIir          : inversion time (TI), in msec, for the IR pulse sequence
TRir          : repetition time (TR), in msec, for the IR pulse sequence
TEge          : echo time (TE), in msec, for the GE pulse sequence
αge           : pulse flip angle (α), in degrees, for the GE pulse sequence
TRge          : repetition time (TR), in msec, for the GE pulse sequence
A, B, C       : different materials (features) in the QC phantom
                (see caption of Fig. 2)
WM, GM, CSF   : brain tissues: white matter, gray matter, cerebrospinal fluid
NS_g          : σNSNR of the feature g in the eigenimage

The optimization results listed in Tables III-VI suggest that
the addition of an SE, IR, or GE image to a sequence of four
MSE images significantly improves the quality (NSNR) of the
eigenimages. ... at least three intrinsic tissue parameters
influence the MRI signal, but only two parameters are adjustable
in a MSE pulse sequence ... at least four degrees of freedom for
generating high NSNR eigenimages. Tables ... with optimal pulse
sequence parameters, should generate phantom eigenimages with
the largest NSNRs. Similarly, a sequence of four MSE and an IR
images, with optimal pulse sequence parameters, should generate
brain eigenimages with the largest NSNRs.

Fig. 3. Original and eigenimages of the QC phantom, using the optimal MRI
pulse sequence parameters ... (a)-(e) 4 MSE (TE/TR = 22-88/500 msec) ...
images. (f)-(h) Resulting eigenimages for vials A, B, and C, respectively.

C. Experiments

Experiments were performed to evaluate the mathematical
predictions. The QC phantom was imaged using
the optimal pulse sequence parameters for segmenting all of
the three features in the scene (i.e., the ones that maximize
the minimum NSNR for the three eigenimages), for each
pulse sequence combination given in Table IV. It was also
imaged using the conventional pulse sequence parameters (the MRI

TABLE VI

Optimum Pulse Sequence Parameters, for Segmentation of All Three Features, and NSs of the
Resultant Eigenimages for Three Combinations of MRI Scene Sequences of the Human Brain


Fig. 4. Original and eigenimages of the QC phantom, using the optimal MRI
pulse sequence parameters for a sequence of four MSE and one GE images.
(a)-(e) four MSE (TE/TR = 35-140/2,500 msec) and one GE (TE/TR = 5/500
msec, α = 40 deg) images. (f)-(h) Resulting eigenimages for vials A, B, and
C, respectively.


Fig. 5. Original and eigenimages of the QC phantom, using the conventional
MRI pulse sequence and parameters. (a)-(e) 4 MSE (TE/TR = 25-100/2,500
msec) and 1 SE (TE/TR = 20/500 msec) images. (f)-(h) Resulting eigenimages
for vials A, B, and C, respectively.


parameters are given in the caption of Table VII). For
each experiment, three eigenimages were generated and their
NSNRs were estimated. Original and eigenimages of the
phantom are shown in Figs. 2-5. The mathematical
and experimental values for the NSNRs of the resulting
eigenimages are compared in Table VII (the mathematical
values are obtained by dividing the σNSNRs in Table IV by an
estimate of σ).

Nine human volunteers were imaged using both the
conventional and the optimal pulse sequence parameters for
segmenting all of the three brain tissues (the MRI parameters
are given in the caption of Table VIII). For each experiment,
three eigenimages were generated and their NSNRs were
estimated. Table VIII compares the resulting NSNRs of the
nine volunteers' eigenimages using conventional MRI pulse
sequence parameters for brain studies with those using the
optimal MRI pulse sequence parameters. The ratio of the
NSNR of the eigenimage generated using optimal parameters
to that of conventional parameters illustrates the improvement
attained as a result of the optimization. This ratio is 1.26 ±
0.31 for white matter, 2.12 ± 0.64 for gray matter, and
1.13 ± 0.21 for cerebrospinal fluid (CSF). Considering
the improvement for the gray matter eigenimage (since it
usually has a smaller NSNR than the white matter and CSF
eigenimages), the optimal pulse sequence almost doubles the
smallest NSNR of the brain eigenimages. This indicates that,
using the optimal protocol and pulse sequence parameters, the
imaging time for the brain eigenimage filtering may be reduced
by 75% if a limited number of slices (less than 10) is going to
be imaged. For entire brain imaging this saving may be less,
since more than one acquisition is needed to cover the entire
volume. As an example, the original and eigenimages of a
volunteer's brain^4 (volunteer #7) are shown in Figs. 6 and 7.
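The quoted ratios and the 75% figure can be checked with a few lines of Python. This is a sketch under two stated assumptions: the white and gray matter values are transcribed from Table VIII, and the SNR attainable at a fixed protocol is taken to grow with the square root of the imaging time, so that an NSNR gain of r lets the same SNR be reached in 1/r² of the time. The population standard deviation is used because it reproduces the values printed in Table VIII.

```python
from statistics import mean, pstdev

# NSNRs of the nine volunteers (conventional vs. optimal), from Table VIII.
con = {
    "WM": [3.73, 3.80, 4.60, 4.68, 4.86, 5.76, 5.75, 7.92, 4.50],
    "GM": [2.49, 3.27, 4.68, 4.12, 4.82, 2.93, 3.02, 3.89, 2.62],
}
opt = {
    "WM": [6.21, 5.53, 7.32, 7.68, 4.77, 6.00, 6.38, 6.13, 4.75],
    "GM": [7.18, 4.75, 9.06, 7.80, 6.88, 7.31, 9.75, 5.31, 6.39],
}


def improvement(tissue):
    """Mean and population standard deviation of the per-volunteer ratios."""
    ratios = [o / c for o, c in zip(opt[tissue], con[tissue])]
    return mean(ratios), pstdev(ratios)


def time_fraction(nsnr_ratio):
    """Fraction of imaging time needed to keep the SNR unchanged (assumed sqrt law)."""
    return 1.0 / nsnr_ratio ** 2


wm_mean, wm_std = improvement("WM")
gm_mean, gm_std = improvement("GM")
print(round(wm_mean, 2), round(gm_mean, 2), time_fraction(2.0))
```

Under the square-root assumption, doubling the NSNR leaves a time fraction of 0.25, i.e., the 75% reduction stated above.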

^4 Note that segmentation of the skull (or muscles and skin) is not usually
(and thus in this paper) the purpose of the eigenimage filtering. Segmenting the brain

Fig. 6. Original and eigenimages of a volunteer's brain (volunteer #7) using
the conventional MRI pulse sequence and parameters. (a)-(e) four MSE
(TE/TR = 25-100/2,500 msec) and one SE (TE/TR = 20/500 msec). (f)-(h)
Resulting eigenimages for white matter, gray matter, and CSF, respectively.


Fig. 7. Original and eigenimages of the volunteer's brain (volunteer #7) using
the optimal MRI pulse sequence and parameters. (a)-(e) four MSE (TE/TR ...)
and one IR (TE/TI/TR = 12/519/2,000 msec). (f)-(h)
Resulting eigenimages for white matter, gray matter, and CSF, respectively.


The improvement in the quality of the eigenimages, as a result
of optimization, is clearly seen in Fig. 7, even better than what
is inferred from NSNRs in Table VIII. The mathematical and
experimental values for the NSNRs of the brain eigenimages
are compared in Table IX.

Using the optimal pulse sequence parameters for a sequence
of only four MSE images, similar experiments for the QC
phantom and the human brain were performed (results are not
shown here for conciseness). It was noticed that the resulting
eigenimages in both cases were superior to those generated
using conventional values for the pulse sequence parameters.

All  of  the  experiments  confirmed  that  an  improvement  can 
be  obtained  by  using  the  optimal  parameters.  There  were, 
however,  some  differences  between  the  mathematical  and 
experimental  results.  We  attribute  these  differences  to  the 
model  inaccuracies  and  the  error  in  estimating  NSNRs  and 
tissue  parameters  (e.g.,  as  a  result  of  difficulty  in  selecting  pure 
regions  of  interest)  as  well  as  the  person-to-person  variation 
of  the  tissue  parameters. 


IV.  Summary  and  Discussion 

We  optimized  MRI  protocols  and  pulse  sequence  parameters 
for  the  eigenimage  filtering.  We  formulated  the  maximization 
of  the  NSNR  of  the  eigenimage  as  a  multidimensional  non¬ 
linear  constrained  optimization  problem,  which  we  solved  by 
the  fixed-point  approach.  The  fixed-point  approach  was  used 


(intracranial tissues) from the muscles and skin, which are outside
the skull, ... corresponding
to the skull (bone), between the two. We have a region growing algorithm
for this segmentation if necessary. In Fig. 7, the muscles and skin were not
removed, since they serve as reference structures without interfering with the
desired feature.


TABLE VII

Comparison of the NSNRs of the QC Phantom Eigenimages,
Using Four Combinations of MRI Pulse Sequences

                       NSNR_A    NSNR_B    NSNR_C
a    Mathematical        9.89     12.26     10.06
     Experimental       13.83      9.34      9.03
b    Mathematical        7.58      9.12      6.80
     Experimental        6.13      9.28      8.10
c    Mathematical       14.04     13.89     14.28
     Experimental       20.66     11.94      9.60
d    Mathematical        5.35      7.72      3.96
     Experimental        5.12      9.92      3.47

because  of  the  ease  with  which  it  handles  the  upper  and  lower 
bounds. 

We found the mathematical predictions for the optimal MRI
parameters for the QC phantom and the human brain. We then
performed several experiments to evaluate the mathematical
predictions. These experiments confirmed that an improvement
can always be obtained by using the optimal parameters. We
found that among the basic MRI protocols currently available
on clinical systems, a sequence of four MSE and an IR images,
using the optimal pulse sequence parameters, would generate
excellent (i.e., with high SNR) brain eigenimages.

Maximization of the NSNR of the eigenimage corresponds
to an appropriate splitting of the time between the
participating pulse sequences to maximize the “distinguishability”
or “dissimilarity” of the desired feature from the interfering
features in the n-dimensional feature space spanned by the
pixel vectors defined from the image sequence. An intuitive
explanation for the optimality of the MSE + IR combination is
as follows. The MRI signal in magnitude reconstructed images
is always positive, hence the most dissimilar signature vectors

TABLE VIII

Experimental NSNRs of the Brain Eigenimages and the Improvement Ratio for Nine Volunteers, Considering
Conventional MRI Parameters (a MSE with TE/TR=25/2,500 msec and a SE with TE/TR=20/500 msec) and the
Optimal MRI Parameters (a MSE with TE/TR=19/1,500 msec and an IR with TE/TI/TR=12/519/2,000 msec)

                              NSNR_W    NSNR_G    NSNR_C
Volunteer #1         con        3.73      2.49      6.59
                     opt        6.21      7.18      9.57
                     ratio      1.66      2.88      1.45
Volunteer #2         con        3.80      3.27      7.52
                     opt        5.53      4.75      7.93
                     ratio      1.45      1.45      1.05
Volunteer #3         con        4.60      4.68      9.74
                     opt        7.32      9.06     11.80
                     ratio      1.59      1.93      1.21
Volunteer #4         con        4.68      4.12      9.75
                     opt        7.68      7.80     10.28
                     ratio      1.64      1.89      1.05
Volunteer #5         con        4.86      4.82      7.24
                     opt        4.77      6.88      8.71
                     ratio      0.98      1.43      1.20
Volunteer #6         con        5.76      2.93     18.00
                     opt        6.00      7.31     12.51
                     ratio      1.04      2.49      0.69
Volunteer #7         con        5.75      3.02     11.51
                     opt        6.38      9.75     11.30
                     ratio      1.11      3.23      0.98
Volunteer #8         con        7.92      3.89     11.65
                     opt        6.13      5.31     13.28
                     ratio      0.77      1.40      1.14
Volunteer #9         con        4.50      2.62      9.88
                     opt        4.75      6.39     13.69
                     ratio      1.05      2.43      1.38
Mean Values          con        5.06      3.54       ...
                     opt        6.08      7.16     11.01
                     ratio      1.26      2.12      1.13
Standard Deviation   con        1.21      0.82      3.23
                     opt        0.94      1.52      1.90
                     ratio      0.31      0.64      0.21

vector. The IR pulse sequence is unique in that it can generate
an image in which the average gray level from a tissue is close
to zero (note that MSE is included in all of the combinations).
An  appropriate  selection  of  the  pulse  sequence  parameters  for 
this  protocol  (MSE  +  IR)  is  therefore  expected  to  generate  the 
best  NSNR  in  the  eigenimage. 
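The nulling property of the IR sequence can be illustrated numerically. The sketch below assumes a common textbook magnitude model for the IR signal, which may differ in detail from the signal expressions used in this work, and the tissue values (T1 = 800 msec, T2 = 80 msec) are hypothetical.

```python
import math


def ir_signal(n_h, t1, t2, te, ti, tr):
    """Magnitude IR signal under an assumed textbook model (times in msec)."""
    recovery = 1.0 - 2.0 * math.exp(-ti / t1) + math.exp(-tr / t1)
    return abs(n_h * recovery * math.exp(-te / t2))


def null_ti(t1, tr):
    """Inversion time at which the modeled signal of a tissue vanishes."""
    return t1 * math.log(2.0 / (1.0 + math.exp(-tr / t1)))


# Hypothetical tissue with T1 = 800 msec, T2 = 80 msec; TE/TR = 12/2,000 msec.
ti_null = null_ti(800.0, 2000.0)
print(round(ti_null, 1), ir_signal(1.0, 800.0, 80.0, 12.0, ti_null, 2000.0))
```

At the nulling inversion time the modeled gray level of that tissue is essentially zero, while at other inversion times it is not; this is the degree of freedom that makes one component of a signature vector vanish.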

Although we considered normal brain tissues for the
feasibility studies in this research, the optimization method will be
applicable to abnormal tissues, as long as fair estimates of the
abnormal tissue parameters are available. These estimates can
usually be obtained from the literature. If they are not found
in the literature, they need to be estimated once. It would then
be possible to use the optimal parameters to image patients
with similar conditions. For example, an optimal protocol for the
eigenimage filtering of stroke patients can be established by
estimating intrinsic parameters for the stroke lesion and the
normal brain tissues from typical stroke patients.

It should be noted that the derivation of the eigenimage filter
(and, in fact, all of the other image combination techniques
[3]) is based on a homogeneity assumption, that is, the average
value of the image gray levels in a pure ROI (i.e., without
including any partial volume pixels) coincides with the MRI
signal for the corresponding tissue (i.e., if there were no noise or
artifact in the image). In other words, a single signature vector


is  enough  for  representing  a  tissue  type.  The  homogeneity 
assumption  may  not  hold  for  certain  types  of  lesions,  which 
we  refer  to  as  multiple-zone  lesions,  since  they  cannot  be 
characterized  by  a  single  signature  vector.  In  this  situation, 
using  the  average  of  the  signature  vectors  for  different  zones  of 
the  lesion  as  the  desired  signature  vector  and  the  surrounding 
normal  tissues  as  the  interfering  features  would  generate  an 
eigenimage  in  which  the  lesion  is  segmented  from  the  normal 
tissues.  Having  different  average  gray  levels  for  different 
zones  of  the  lesion  makes  it  possible  to  distinguish  them  from 
each  other.  One  approach  for  further  processing  would  be  to 
look  at  the  histogram  of  the  eigenimage  which  may  allow 
segmentation  of  different  zones  by  selecting  gray  level  cut¬ 
offs  at  the  histogram  valleys.  A  second  approach  would  be 
to  segment  the  entire  lesion  (and  generate  the  corresponding 
mask)  by  appropriately  thresholding  the  eigenimage,  and  then 
subimaging  the  original  images  with  the  mask,  and  finally 
running  the  eigenimage  filter  by  defining  one  zone  of  the 
lesion  as  the  desired  feature  and  the  other  zones  as  the 
interfering features. For the optimization, one may optimize
the segmentation of the normal tissues from one another, or
the segmentation of the average lesion from the normal tissues.

Note that the time needed for the estimation of the tissue
parameters is not related to the final optimal protocol, the
reason being the use of standard protocols for the estimation
of the tissue parameters. We are not optimizing these standard
protocols.

The sensitivity of the objective function to the tissue
parameters can be estimated by calculating partial derivatives of the
objective function with respect to each of the tissue parameters.
It can be found empirically by calculating the difference
between the theoretically expected value of the objective
function and the experimental one. These were considered
beyond the scope of this paper in order to keep the length of
the paper reasonable. In addition to the person-to-person and
the section-to-section variation of the tissue parameters, there
are other inaccuracies, e.g., those of the MRI signal models. The
practical issue is the overall effect of these imperfections on
the final results, which has been examined for the QC phantom
and normal brain. Tables VII and IX compare the mathematical
and experimental values for the NSNRs of the resulting
eigenimages, and illustrate the overall sensitivity of the objective
function to all of the imperfections. Table VII shows four
combinations of MRI pulse sequences (three optimal and the
conventional): (a) 4 MSE (TE/TR=31.45/498.7 msec) and 1
SE (TE/TR=10.0/1,500 msec); (b) 4 MSE (TE/TR=22.13/500
msec) and 1 IR (TE/TI/TR=20.0/776.7/2,122 msec); (c) 4 MSE
(TE/TR=35.33/500 msec) and 1 GE (TE/TR=5.0/500 msec,
α=39.38 deg); (d) 4 MSE (TE/TR=25/2,500 msec) and 1 SE
(TE/TR=20/500 msec). Furthermore, the experimental results
of imaging nine volunteers in Table VIII show that the use
of the optimal pulse sequence has improved the NSNR for
all of the brain eigenimages. In Table VIII, the conventional
MRI parameters are an MSE with TE/TR=25/2,500 msec
and an SE with TE/TR=20/500 msec, and the optimal MRI
parameters are an MSE with TE/TR=19/1,500 msec and an
IR with TE/TI/TR=12/519/2,000 msec. An underlying reason
for the ability to obtain this improvement is the fact that
the eigenimage filter does not propagate the aforementioned
imperfections, since it finds and uses the signature vectors
from each slice of the acquired data and not from the MRI
signal models.
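The partial-derivative approach to sensitivity can be sketched with central finite differences. As a stand-in for the NSNR objective, which is not reproduced here, the example uses an assumed simplified spin-echo signal model and hypothetical tissue values; only the differencing scheme is the point of the illustration.

```python
import math


def se_signal(tissue, te, tr):
    """Assumed simplified SE signal; tissue = [N(H), T1, T2], times in msec."""
    n_h, t1, t2 = tissue
    return n_h * (1.0 - math.exp(-tr / t1)) * math.exp(-te / t2)


def sensitivities(func, params, rel_step=1e-4):
    """Central-difference estimate of d func / d params[i] for every i."""
    grads = []
    for i, p in enumerate(params):
        h = rel_step * abs(p) if p != 0 else rel_step
        up = list(params)
        up[i] = p + h
        dn = list(params)
        dn[i] = p - h
        grads.append((func(up) - func(dn)) / (2.0 * h))
    return grads


tissue = [1.0, 800.0, 80.0]  # hypothetical N(H), T1, T2
grads = sensitivities(lambda t: se_signal(t, te=20.0, tr=500.0), tissue)
print(grads)
```

For this model the signal rises with N(H) and T2 and falls with T1 at TE/TR = 20/500 msec; the same routine could be applied to any objective that is a function of the tissue parameters.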

An important result of our investigation was that, on
average, the optimal pulse sequence would almost double the
smallest NSNR of the brain eigenimages, as compared to
the conventional brain protocol. This indicates that using the
optimal protocol and pulse sequence parameters would reduce
the imaging time for the eigenimage filtering of brain studies
by up to 75%.

The optimization technique can be modified to find the
optimum MRI protocols and pulse sequence parameters for
filters (transformations) other than the eigenimage filter. For
example, using the analytical expressions for the SNR or CNR
of the transformed images given in [3], similar optimization
problems can be formulated and solved, yielding the
optimum MRI protocols and pulse sequence parameters for the
corresponding filters.

Finally, the optimization technique can be extended so that
it considers the recently developed pulse sequences for fast
MRI scans such as turbo flash, fast spin-echo, and even echo
planar. This extension will make it possible to optimally utilize
a variety of the newest MRI pulse sequences, in conjunction
with eigenimage filtering, to assist radiologists in extracting the
maximum amount of information from an increasingly larger
data set.

Appendix 

A.  Minimizing  Convex  Functions  with  Convex  Constraints 

In the following we show that for a convex objective
function and a set of convex constraints, the solution to the
fixed-point problem in (21) is the minimizer of the objective
function.

Assume $(x, \lambda)$ is the fixed point of (21), and $y$ is a
feasible point, i.e., it satisfies all of the constraints. We prove
$x^+$ is the minimizer by showing that $f(x^+) \le f(y)$. From
the convexity assumption, we have

$$ f(y) \ge f(x^+) + \nabla f(x^+)(y - x^+)^T \tag{57} $$

$$ g_j(y) \ge g_j(x^+) + \nabla g_j(x^+)(y - x^+)^T, \quad j = 1, \ldots, m. \tag{58} $$

Since $\lambda^+ \ge 0$ and $g_j(y) \le 0$, $\lambda^+ g^T(y) \le 0$ and we can write

$$ f(y) \ge f(y) + \lambda^+ g^T(y). \tag{59} $$

Substituting the right-hand sides of (57) and (58) into the
right-hand side of (59), we obtain

$$ \begin{aligned} f(y) &\ge f(x^+) + \nabla f(x^+)(y - x^+)^T + \lambda^+ [g^T(x^+) + \nabla g(x^+)(y - x^+)^T] \\ &\ge f(x^+) + [\nabla f(x^+) + \lambda^+ \nabla g(x^+)](y - x^+)^T \\ &\ge f(x^+). \end{aligned} \tag{60} $$

In the above we used (26) and

$$ [\nabla f(x^+) + \lambda^+ \nabla g(x^+)](y - x^+)^T \ge 0. \tag{61} $$

Inequality (61) holds because from (21)
$\nabla f(x^+) + \lambda^+ \nabla g(x^+) = -x^-$, and for each component $i$ one of
the following three cases should hold: (1) $x_i^+ = x_i$, i.e.,
$lb_i < x_i < ub_i$; (2) $x_i^+ = lb_i$, i.e., $x_i \le lb_i$; or (3) $x_i^+ = ub_i$,
i.e., $x_i \ge ub_i$. For (1) $x_i^- = 0$; for (2) $-x_i^- \ge 0$ and
$y_i \ge lb_i = x_i^+$, i.e., $-x_i^-[y_i - x_i^+] \ge 0$; and for (3) $-x_i^- \le 0$
and $y_i \le ub_i = x_i^+$, i.e., $-x_i^-[y_i - x_i^+] \ge 0$. Q.E.D.
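The three cases at the end of the proof can be checked numerically. The sketch below assumes that x⁺ denotes the projection of x onto the box [lb, ub] and that x⁻ = x − x⁺, which is how the cases read; the helper names and the sample bounds are hypothetical.

```python
import random


def project(x, lb, ub):
    """Componentwise projection of x onto the box [lb, ub] (assumed x+)."""
    return [min(u, max(l, xi)) for xi, l, u in zip(x, lb, ub)]


def split(x, lb, ub):
    """Return (x+, x-) with x = x+ + x-."""
    x_plus = project(x, lb, ub)
    x_minus = [xi - pi for xi, pi in zip(x, x_plus)]
    return x_plus, x_minus


def inner_term(x, y, lb, ub):
    """Sum over i of -x-_i (y_i - x+_i), the quantity bounded in (61)."""
    x_plus, x_minus = split(x, lb, ub)
    return sum(-m * (yi - p) for m, yi, p in zip(x_minus, y, x_plus))


random.seed(0)
lb, ub = [19.0, 1000.0, 12.0], [100.0, 1500.0, 20.0]
worst = min(
    inner_term(
        [random.uniform(l - 50.0, u + 50.0) for l, u in zip(lb, ub)],  # any x
        [random.uniform(l, u) for l, u in zip(lb, ub)],                # feasible y
        lb,
        ub,
    )
    for _ in range(1000)
)
print(worst >= 0.0)
```

For every sampled x and every y inside the box the sum is nonnegative, in line with the case analysis.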

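B. Gradient Calculation by Perturbation Method

Let us start with the calculation of $\nabla_U Q$.

$$ \begin{aligned} Q(U + \Delta U) &= I - (U + \Delta U)[(U + \Delta U)^T (U + \Delta U)]^{-1} (U + \Delta U)^T \\ &= I - (U + \Delta U)[U^T U + \Delta U^T U + U^T \Delta U + o(\|\Delta U\|_2)]^{-1} (U + \Delta U)^T \\ &= I - (U + \Delta U)[(U^T U)^{-1} - (U^T U)^{-1}[\Delta U^T U + U^T \Delta U](U^T U)^{-1}](U + \Delta U)^T + o(\|\Delta U\|_2) \\ &= [I - U (U^T U)^{-1} U^T] - \Delta U (U^T U)^{-1} U^T - U (U^T U)^{-1} \Delta U^T \\ &\quad + U (U^T U)^{-1} [\Delta U^T U + U^T \Delta U] (U^T U)^{-1} U^T + o(\|\Delta U\|_2) \end{aligned} \tag{62} $$

Hence

$$ \nabla_U Q \, \Delta U' = -\Delta U (U^T U)^{-1} U^T - U (U^T U)^{-1} \Delta U^T + U (U^T U)^{-1} [\Delta U^T U + U^T \Delta U] (U^T U)^{-1} U^T \tag{63} $$

Here $\Delta U' = [\Delta U_{11}, \Delta U_{21}, \ldots, \Delta U_{m'n}]^T$ is a column vector
generated by sweeping $\Delta U$ row-by-row, and $\nabla_U Q$ is an $n^2 \times nm'$
matrix whose (rs)th column is found by substituting $\Delta U$
by $H_{rs}$. The $n \times m'$ matrix $H_{rs}[i,j]$, $1 \le i \le n$, $1 \le j \le m'$,
is defined as

$$ H_{rs}[i,j] = \begin{cases} 1 & \text{if } i = r \text{ and } j = s \\ 0 & \text{otherwise.} \end{cases} \tag{64} $$

Sweeping over $1 \le r \le n$ and $1 \le s \le m'$ generates all $nm'$
columns of $\nabla_U Q$. Similarly,

$$ \begin{aligned} f(Q + \Delta Q) &= -(\ldots)\,[d^T (Q + \Delta Q)^T (Q + \Delta Q) d] \\ &= -(\ldots)\,[d^T (Q^T Q + \Delta Q^T Q + Q^T \Delta Q + \Delta Q^T \Delta Q) d] \\ &= -(\ldots)\,[d^T Q^T Q d + d^T (\Delta Q^T Q + Q^T \Delta Q) d] + o(\|\Delta Q\|_2) \end{aligned} \tag{65} $$

Thus

$$ \nabla_Q f \, \Delta Q' = -(\ldots)\,[d^T (\Delta Q^T Q + Q^T \Delta Q) d]. \tag{66} $$

Again $\Delta Q'$ is generated by sweeping $\Delta Q$ row-by-row. Using
$H_{rs}$ as defined below, columns (elements) of $\nabla_Q f$ are found
(note that $\nabla_Q f$ is a $1 \times n^2$ vector).

$$ H_{rs}[i,j] = \begin{cases} 1 & \text{if } i = r \text{ and } j = s \\ 0 & \text{otherwise} \end{cases} \tag{67} $$

where $1 \le r, s \le n$.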
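The first-order expression for the change in Q can be verified numerically. The sketch below specializes to a single interfering signature vector (U has one column), so that UᵀU is a scalar and no matrix inversion is needed; the vectors are arbitrary test values, not data from this work.

```python
def projector(u):
    """Q = I - u u^T / (u^T u) for a single-column U."""
    s = sum(v * v for v in u)
    n = len(u)
    return [[(1.0 if i == j else 0.0) - u[i] * u[j] / s for j in range(n)]
            for i in range(n)]


def first_order_dq(u, du):
    """First-order change of Q under u -> u + du, per the perturbation formula."""
    s = sum(v * v for v in u)
    c = 2.0 * sum(a * b for a, b in zip(du, u))  # du^T u + u^T du
    n = len(u)
    return [[-du[i] * u[j] / s - u[i] * du[j] / s + u[i] * c * u[j] / (s * s)
             for j in range(n)] for i in range(n)]


u = [1.0, 2.0, 2.0]
du = [1e-6, -2e-6, 3e-6]
q0 = projector(u)
q1 = projector([a + b for a, b in zip(u, du)])
pred = first_order_dq(u, du)

# Largest entry of the residual Q(u + du) - Q(u) - predicted change.
err = max(abs(q1[i][j] - q0[i][j] - pred[i][j])
          for i in range(3) for j in range(3))
print(err)
```

The residual is of second order in the perturbation, several orders of magnitude below the first-order change itself.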


References

J. P. Windham, M. A. Abd-Allah, D. A. Reimann, J. W. Froelich, and
A. M. Haggar, “Eigenimage filtering in MR imaging,” J. Comput. Assist.
Tomogr., vol. 12, no. 1, pp. 1-9, 1988.

H. Soltanian-Zadeh, J. P. Windham, and A. E. Yagle, “Optimal
transformation for correcting partial volume averaging effects in magnetic
resonance imaging,” IEEE Trans. Nuc. Sci., July/Aug. 1993.

H. Soltanian-Zadeh, J. P. Windham, D. J. Peck, and A. E. Yagle, “A
comparative analysis of several transformations for enhancement and
segmentation of magnetic resonance image scene sequences,” IEEE
Trans. Med. Imag., vol. 11, no. 3, pp. 302-318, Sept. 1992.

H. Soltanian-Zadeh, J. P. Windham, and A. E. Yagle, “Magnetic
resonance image restoration using a new multi-dimensional non-linear
edge-preserving filter,” submitted to IEEE Trans. Imag. Proc.

R. E. Hendrick, F. D. Newman, and W. R. Hendee, “MR imaging
technology: Maximizing the signal-to-noise ratio from a single tissue,”
Radiology, vol. 156, no. 3, pp. 749-752, 1985.

C. B. Ahn, S. Y. Lee, O. Nalcioglu, and Z. H. Cho, “An improved
nuclear magnetic resonance diffusion coefficient imaging method using
an optimized pulse sequence,” Med. Phys., vol. 13, no. 6, pp. 789-793,
Nov./Dec. 1986.

W. L. Greif, R. B. Buxton, R. B. Lauffer et al., “Pulse sequence
optimization for MR imaging using a paramagnetic hepatobiliary contrast
agent,” Radiology, vol. 157, pp. 461-466, 1985.

C. N. de Graaf and C. J. G. Bakker, “Simulation procedure to determine
nuclear magnetic resonance imaging pulse sequence parameters for
optimal tissue contrast,” J. Nuclear Med., vol. 27, no. 2, pp. 281-286,
Feb. 1986.

R. M. Henkelman et al., “Optimal pulse sequence for imaging hepatic
metastases,” Radiology, vol. 161, pp. 727-734, 1986.

M. R. Paling et al., “Liver metastases: Optimization of MR imaging
pulse sequences at 1.0 T,” Radiology, vol. 167, pp. 695-699, 1988.

C. J. Fretz et al., “Superparamagnetic iron oxide-enhanced MR imaging:
Pulse sequence optimization for detection of liver cancer,” Radiology,
vol. 172, pp. 393-397, 1989.

W. Dreher and P. Börnert, “Pulse sequence and parameter choice in
NMR imaging as a problem of constrained multidimensional nonlinear
optimization,” Magnetic Resonance in Med., vol. 8, pp. 16-24, 1988.

M. R. Mitchell, T. E. Conturo, T. J. Gruber, and J. P. Jones, “Two
computer models for selection of optimal magnetic resonance imaging
(MRI) pulse sequence timing,” Investigative Radiology, pp. 349-360,
Sept./Oct. 1984.

L. E. Quint et al., “In vivo and in vitro MR imaging of renal tumors:
Histopathologic correlation and pulse sequence optimization,” Radiology,
vol. 169, pp. 359-362, 1988.

H. Iwaoka, T. Hirata, and H. Matsuura, “Optimal pulse sequences for
magnetic resonance imaging-computing accurate T1, T2, and proton
density images,” IEEE Trans. Med. Imag., vol. 6, no. 4, pp. 360-369,
Dec. 1987.

G. Bielke, “A method for optimization of pulse sequence in NMR
imaging,” Med. Progress through Technol., vol. 10, pp. 171-176, 1984.

J. N. Lee and S. J. Riederer, “Optimum acquisition times of two
spin echoes for MR image synthesis,” Magn. Reson. Med., vol. 3, pp.
634-638, 1986.

E. R. McVeigh, M. J. Bronskill, and R. M. Henkelman, “Optimization
of MR protocols: A statistical decision analysis approach,” Magn. Res.
Med., vol. 6, pp. 314-333, 1988.

R. E. Hendrick, “Image contrast and noise,” Magnetic Resonance
Imaging, First ed., Stark and Bradley, Eds., St. Louis, MO: Mosby-Year
Book, Inc., pp. 66-83, 1988.

F. W. Wehrli, J. R. MacFall et al., “Mechanism of contrast in NMR
imaging,” J. Comput. Assist. Tomogr., vol. 8, no. 3, pp. 369-380, 1984.

F. W. Wehrli, R. K. Breger et al., “Quantification of contrast in clinical
MR brain imaging at high magnetic field,” Inves. Radiology, vol. 20,
no. 4, pp. 360-369, 1985.

B. C. Eaves and R. Saigal, “Homotopies for computing fixed points in
unbounded regions,” Math. Programming, vol. 3, pp. 225-237, 1972.


APPENDIX  R2 

H.  Soltanian-Zadeh,  A.E.  Yagle,  J.P.  Windham,  and  D.O.  Hearshen,  “Op¬ 
timization  of  MRI  Protocols  and  Pulse  Sequence  Parameters  for  Eigenimage 
Filtering,”  IEEE  1992  Medical  Imaging  Conference,  Orlando,  FL,  Oct.  25-31, 
1992,  pp.  1325-27. 

This is the conference paper version of Appendix R1.


Optimization of MRI Protocols and Pulse Sequence
Parameters for Eigenimage Filtering

Hamid Soltanian-Zadeh, A. E. Yagle, J. P. Windham, and D. O. Hearshen
The University of Michigan, Ann Arbor, MI, and Henry Ford Hospital ...


Abstract

The eigenimage filter generates a composite image in which
... The signal-to-noise ratio
(SNR) of the eigenimage is directly proportional to the
dissimilarity between the desired and interfering features.
Since image gray levels are analytical functions of MRI
parameters, ... optimizing these parameters. For optimization
we consider four MRI pulse sequences: multiple spin-echo (MSE);
spin-echo (SE); inversion recovery (IR); and gradient-echo
(GE). We use the mathematical expressions for MRI signals
along with intrinsic tissue parameters to express the
objective function (NSNR of the eigenimage) as a function of
MRI parameters. The objective function along with a set
of ... multi-dimensional non-linear constrained optimization
problem ... its application to
phantom and brain images. We found that the optimal
pulse sequence parameters for a sequence of four MSE ...
doubles the normalized SNR (NSNR) of the ...
brain ..., as compared to the conventional ...

I. Introduction

... interest during the last
... years in deriving optimal MRI protocols and
pulse sequence parameters for several figures-of-merit (FOM)
... contrast-to-noise ratio
(CNR) [2] (sometimes normalized to the square root
... accuracy of the calculated tissue parameters
... synthesized image [6]; and
... diagnostic ... in that
... NSNR ... of a composite image (eigenimage)
generated by a linear combination of several images
(an MRI scene sequence) which are acquired using ...
optimization ...
(iv), (v), and (vi) above, in that for all of these ...
multiple pulse sequences are used and hence there are
several parameters to be optimized.

0-7803-0883-2/93/$3.00 ©1993 IEEE

... signature vectors, we have shown that the SNR of the
eigenimage is directly proportional to [8]: ...
here [9]; here we consider item 2. Since the image gray
levels are analytical functions of MRI protocols and pulse
sequence parameters, we are able to maximize the
dissimilarity between the desired and interfering features by
optimizing MRI protocols and pulse sequence
parameters. In Section II, we present the problem formulation
and the fixed point approach to find a solution. In Section III,
we present ... optimization results for a quality control
(QC) phantom and the human brain. In Section IV, we
give a summary and conclusions.


II.  METHODS 


A.  Problem  Formulation 

For optimization, we consider MRI scene sequences generated by four MRI
protocols or pulse sequences: (i) multiple spin-echo (MSE); (ii) spin-echo
(SE); (iii) inversion recovery (IR); and (iv) gradient-echo (GE). For a
given number of images in the sequence, acquired by a certain combination
of these MRI protocols, we use: (i) the mathematical expression of the MRI
signal from a tissue (an element of a signature vector) [10]; and (ii)
intrinsic tissue parameters (N(H), T1, and T2) [11] to express the
objective function f (NSNR of the eigenimage) in terms of the MRI pulse
sequence parameters (TE, TI, TR, and α).

The objective function, along with a set of diagnostic or instrumental
constraints on pulse sequence parameters, define a multi-dimensional
non-linear constrained optimization problem. Assuming the constraints are
consistent, there will be a domain of pulse sequences and pulse sequence
parameters from which the optimal procedure is extracted.

The formulated optimization problem is therefore

$$
\begin{aligned}
\text{Maximize} \quad & f(x) \\
\text{subject to} \quad & lb_i \le x_i \le ub_i, \quad i = 1, \ldots, p, \\
& g_j(x) \le 0, \quad j = 1, \ldots, m,
\end{aligned}
\tag{1}
$$

where $f$ is the NSNR of the eigenimage, $x$ contains pulse sequence
parameters, $lb_i \le x_i \le ub_i$, $i = 1, \ldots, p$, implement lower
and upper bounds ($x_i$ is a varying parameter, $p$ is the number of
varying parameters, and $lb_i$ and $ub_i$ are lower and upper bounds,
respectively), and $g_j(x) \le 0$, $j = 1, \ldots, m$, are linear
constraints ($m$ is the number of required relationships among the
parameters).


B.  Solution 

We translate the formulated optimization problem in (1) into a fixed point
problem. We then solve the fixed point problem using the method of Eaves
and Saigal [12], which is the only available fixed point algorithm. It is a
numerical method for solving a multi-dimensional system of equations in the
form $f(x) = x$ defined from $\mathbb{R}^n$ to $\mathbb{R}^n$. This
algorithm uses a methodology based on triangulating the space (as in finite
element methods) and following a path of solutions to a minimum. This
method is ideally suited for non-differentiable optimization, and thus can
handle inequality and equality constraints without explicitly considering
the active sets. For other methods, this can be a problem [12]. It is
robust in passing the local minima and giving the global minimum, except
when there exists a large peak between the local and global minima. Its
rate of convergence is quadratic and is thus fast.
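One common way to recast a bound-constrained maximization as a fixed point problem is the projected gradient map T(x) = clip(x + alpha * grad f(x), lb, ub), whose fixed points are stationary points of the constrained problem. The sketch below only illustrates that idea on a toy concave objective. It uses plain fixed point iteration, not the Eaves-Saigal triangulation algorithm, and the objective, the step size `alpha`, and all function names are assumptions made for illustration rather than the NSNR function or the translation used in the paper.

```python
import numpy as np

def projected_gradient_map(x, grad_f, lb, ub, alpha):
    """T(x) = clip(x + alpha * grad f(x), lb, ub).

    A point with T(x) = x is a stationary point of
    'maximize f(x) subject to lb <= x <= ub'.
    """
    return np.clip(x + alpha * grad_f(x), lb, ub)

def solve_fixed_point(grad_f, x0, lb, ub, alpha=0.25, tol=1e-10, max_iter=10_000):
    """Plain fixed point iteration x <- T(x) (a stand-in solver, hypothetical)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_new = projected_gradient_map(x, grad_f, lb, ub, alpha)
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

def toy_grad(x):
    """Gradient of the toy objective f(x) = -(x0 - 2)^2 - (x1 + 1)^2."""
    return np.array([-2.0 * (x[0] - 2.0), -2.0 * (x[1] + 1.0)])

lb = np.array([0.0, 0.0])   # lower bounds lb_i
ub = np.array([1.0, 1.0])   # upper bounds ub_i
x_star = solve_fixed_point(toy_grad, x0=[0.5, 0.5], lb=lb, ub=ub)
print(x_star)
```

The unconstrained maximizer of the toy objective lies outside the box, so the fixed point sits on the boundary, which is the situation the bound constraints in (1) are meant to capture.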


III.  RESULTS 

The optimization technique was applied to the QC phantom and the human
brain. For each combination of MRI pulse sequences, the optimal parameters
were found and the NSNRs of the resulting eigenimages were mathematically
predicted. These results are summarized in Tables 1 and 2. It is seen that
a sequence of four MSE and a GE, with optimal pulse sequence parameters, is
expected to generate phantom eigenimages with largest NSNRs. Similarly, a
sequence of four MSE and an IR, with optimal pulse sequence parameters, is
expected to generate brain eigenimages with largest NSNRs.


Table 1: NSNRs of the QC phantom eigenimages, considering optimum
parameters for three combinations of MRI pulse sequences: (a) 4 MSE
(TE/TR = 31.45/498.7 msec) and 1 SE (TE/TR = 10.0/1500 msec); (b) 4 MSE
(TE/TR = 22.13/500.1 msec) and 1 IR (TE/TI/TR = 20.0/776.7/2122 msec);
(c) 4 MSE (TE/TR = 35.33/500 msec) and 1 GE (TE/TR = 5.0/500 msec,
α = 39.38 deg).

Combination  Result         NSNR_A   NSNR_B   NSNR_C
a            Mathematical     9.89    12.26    10.06
a            Experimental    13.83     9.34     9.03
b            Mathematical     7.58     9.12     6.80
b            Experimental     6.13     9.28     8.10
c            Mathematical    14.04    13.89    14.28
c            Experimental    20.66    11.94     9.60

Table 2: NSNRs of the brain eigenimages, considering optimum parameters
for three combinations of MRI pulse sequences: (a) 4 MSE (TE/TR =
52.55/2499 msec) and 1 SE (TE/TR = 10.0/500 msec); (b) 4 MSE (TE/TR =
19.00/1500 msec) and 1 IR (TE/TI/TR = 12.0/519.0/2000 msec); (c) 4 MSE
(TE/TR = 53.51/2467 msec) and 1 GE (TE/TR = 5.0/500 msec, α = 111.95 deg).

Combination  Result         NSNR_W   NSNR_G   NSNR_C
a            Mathematical     5.78     4.59    20.55
b            Mathematical    11.28    11.77    14.39
b            Experimental    10.21    12.57    13.24
c            Mathematical     5.94     4.76    21.00

Several experiments were performed to evaluate the mathematical
predictions. The QC phantom was imaged using each set of pulse sequence
parameters given in Table 1. Eight human volunteers were imaged using both
the conventional and the optimal pulse sequence parameters for a sequence
of four MSE and an IR given in Table 2. For each experiment, three
eigenimages were generated and their NSNRs were estimated. Original images
and eigenimages of the QC phantom and two human brains were shown in the
presentation; we omit them here due to the page limitation. Tables 1 and 2
compare the mathematical and experimental NSNRs of the resulting
eigenimages. Table 3 compares the resulting NSNRs of eight volunteers'
eigenimages using conventional MRI pulse sequence parameters for brain
studies with those using the optimal MRI pulse sequence parameters. The
ratio of the NSNR of the eigenimage generated using optimal parameters to
that of conventional parameters illustrates the improvement attained as a
result of optimization. This ratio is 1.54 ± 0.36 for white matter,
2.75 ± 1.30 for gray matter, and 1.19 ± 0.31 for cerebrospinal fluid (CSF).

Table 3: Experimental NSNRs of the brain eigenimages and the improvement
ratio (ratio), considering conventional (con) MRI parameters (a MSE with
TE/TR = 25/2500 msec and a SE with TE/TR = 20/500 msec) and the optimal
(opt) MRI parameters (a MSE with TE/TR = 19/1500 msec and an IR with
TE/TI/TR = 12/519/2000 msec).

Statistic (8 volunteers)  Parameters  NSNR_W   NSNR_G   NSNR_C
mean values               con           6.86     5.13    11.77
mean values               opt          10.21    12.57    13.24
mean values               ratio         1.54     2.75     1.19
standard deviations       con           1.32     1.70     3.14
standard deviations       opt           2.06     2.97     2.32
standard deviations       ratio         0.36     1.03     0.31
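As a quick arithmetic cross-check of Table 3, the short sketch below divides the mean optimal NSNRs by the mean conventional NSNRs. The reported ratios are presumably averages of the per-volunteer ratios (the source does not say so explicitly), so they need not equal these ratios of means; the dictionary keys are illustrative labels only.

```python
# Mean NSNRs over the eight volunteers, copied from Table 3.
con = {"white": 6.86, "gray": 5.13, "csf": 11.77}
opt = {"white": 10.21, "gray": 12.57, "csf": 13.24}
# Improvement ratios as reported in Table 3.
reported = {"white": 1.54, "gray": 2.75, "csf": 1.19}

# Ratio of the mean values (not the same statistic as a mean of ratios).
ratio_of_means = {tissue: opt[tissue] / con[tissue] for tissue in con}

for tissue in con:
    print(f"{tissue}: ratio of means = {ratio_of_means[tissue]:.2f}, "
          f"reported ratio = {reported[tissue]:.2f}")
```

Both statistics point the same way: the largest gain is for gray matter and the smallest for CSF.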

IV.  SUMMARY 

We optimized MRI protocols and pulse sequence parameters for the
eigenimage filtering. We formulated the maximization of the NSNR of the
eigenimage as a multi-dimensional non-linear constrained optimization
problem, which we solved by the fixed point approach.

We found the mathematical predictions for the optimal MRI parameters for
the QC phantom and the human brain. We then performed several experiments
to evaluate the mathematical predictions. These experiments confirmed that
an improvement can always be obtained by using the optimal parameters. We,
however, observed some differences between the mathematical and
experimental results. We attribute these differences to the model
inaccuracies and the error in estimating NSNRs and tissue parameters as
well as the person-to-person variation of the tissue parameters.
The final outcome of the investigation was that on average the optimal
pulse sequence almost doubled the NSNR of the brain eigenimages, as
compared to the conventional brain protocol. This indicates that using the
optimal protocol and pulse sequence parameters can reduce the imaging time
for the eigenimage filtering of brain studies by 75%.


APPENDIX S

H.  Soltanian-Zadeh,  J.P.  Windham  and  A.E.  Yagle,  “A  Multidimensional 
Non-Linear  Edge-Preserving  Filter  for  Magnetic  Resonance  Image  Restora¬ 
tion,”  to  appear  in  IEEE  Trans.  Image  Proc. 

Although the edge-preserving filter with locally-varying properties was
designed specifically for MRI, it may have applications elsewhere.


A Multidimensional Nonlinear Edge-Preserving
Filter for Magnetic Resonance Image Restoration

Hamid Soltanian-Zadeh, Member, IEEE, Joe P. Windham, Member, IEEE, and Andrew E. Yagle, Member, IEEE


Abstract: This paper presents a multidimensional nonlinear
edge-preserving filter for restoration and enhancement of magnetic
resonance images (MRI). The filter uses both interframe (parametric or
temporal) and intraframe (spatial) information to filter the additive noise
from an MRI scene sequence. It combines the approximate maximum likelihood
(equivalently, least squares) estimate of the interframe pixels, using MRI
signal models, with a trimmed spatial smoothing algorithm, using a
Euclidean distance discriminator to preserve partial volume and edge
information. (Partial volume information is generated from voxels
containing a mixture of different tissues.) Since the filter's structure is
parallel, its implementation on a parallel processing computer is
straightforward. Details of the filter implementation for a sequence of
four multiple spin-echo images are explained, and the effects of filter
parameters (neighborhood size and threshold value) on the computation time
and performance of the filter are discussed. The filter is applied to MRI
simulation and brain studies, serving as a preprocessing procedure for the
eigenimage filter. (The eigenimage filter generates a composite image in
which a feature of interest is segmented from the surrounding interfering
features.) It outperforms conventional pre- and post-processing filters,
including spatial smoothing, low-pass filtering with a Gaussian kernel,
median filtering, and combined vector median with average filtering.


I.  Nomenclature 

For the mathematical developments, we use an n-dimensional vector space
$(\mathbb{R}^n, \mathbb{R})$, where $n$ is the number of images in the MRI
scene sequence. For example, when dealing with an MRI scene sequence
consisting of a T1-weighted and four T2-weighted spin-echo images, we use
a 5-D vector space. Using the vector space concept, the following
representations are introduced. The MRI scene sequence is represented by
pixel vectors. A pixel vector
$\mathbf{P}_{jk} = [P_{jk1}\ P_{jk2}\ \cdots\ P_{jkn}]^T$ is a vector whose
elements are the corresponding gray levels of the $(j,k)$th pixels in the
MR images (see Fig. 1). The image size determines the number of these pixel
vectors, e.g., for 256 x 256 images, there are $2^{16}$ pixel vectors. The
MRI characteristics of tissue types are represented by signature vectors.
For image analysis, one is normally interested in clearly visualizing one
of the tissue types (which is referred to as the desired feature), whereas
other tissue types (which are referred to as the undesired or interfering
features) interfere with its visualization. A desired signature vector
$\mathbf{d} = [d_1\ d_2\ \cdots\ d_n]^T$ is defined as a vector whose $i$th
element is the average gray level of the desired feature in the $i$th
image. Undesired (interfering) signature vectors
$\mathbf{u}_i = [u_{1i}\ u_{2i}\ \cdots\ u_{ni}]^T$, $1 \le i \le m$, are
similarly defined for the interfering features. Finally, vectors
$\mathbf{P}^{d}_{jk}$ and $\mathbf{P}^{u}_{lm}$ are pixel vectors from the
desired feature at location $(j,k)$ and the undesired feature at location
$(l,m)$, respectively. These notations, as well as those defined elsewhere
in the paper, are summarized in a list below the following list of
abbreviations.

Manuscript received May 24, 1992; revised October 13, 1993. This work ...
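The pixel-vector and signature-vector representations can be sketched with array operations. The example below is only an illustration on synthetic data: the random scene, the ROI location, and the variable names are assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, rows, cols = 5, 256, 256            # e.g., one T1-weighted and four T2-weighted images

# Synthetic stand-in for an MRI scene sequence, stored as n separate images.
scene = rng.normal(loc=100.0, scale=5.0, size=(n, rows, cols))

# pixel_vectors[j, k] is the n-dimensional pixel vector P_jk.
pixel_vectors = np.moveaxis(scene, 0, -1)        # shape (rows, cols, n)
flat = pixel_vectors.reshape(-1, n)              # one row per pixel vector

# Desired signature vector d: average gray level of the desired feature
# in each image, taken over a (hypothetical) region of interest.
droi = np.zeros((rows, cols), dtype=bool)
droi[100:120, 100:120] = True
d = pixel_vectors[droi].mean(axis=0)             # shape (n,)

print(flat.shape, d.shape)
```

For 256 x 256 images the flattened array has 65536 rows, matching the pixel-vector count stated above.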


1) List of Abbreviations

AVG    average
CNR    contrast-to-noise ratio
SNR    signal-to-noise ratio
CSF    cerebrospinal fluid
DROI   desired feature ROI
UROI   undesired feature ROI
EPV    estimated partial volume
OPV    original partial volume
LS     least squares
MLE    maximum likelihood estimate
MR     magnetic resonance
MRI    magnetic resonance imaging/images
NS     neighborhood size
Pd     probability of detection
Pf     probability of false alarm
ROI    region of interest

2) List of Notations

$\mathrm{CNR}_i$               CNR between the desired and the $i$th interfering feature
$\mathbf{d}$                   desired feature signature vector
$\mathbf{e}$                   weighting vector for the eigenimage filter
$E[\cdot]$                     expected value operator
$\hat{E}[\cdot]$               expected value estimate (sample mean)
$EI_{jk}$                      gray level of the $(j,k)$th pixel in the eigenimage
$m$                            number of interfering features in the scene
$n$                            number of images in the MRI scene sequence
$N$                            number of pixels in the DROI
$P_{jk}$                       gray level of the $(j,k)$th pixel in an image
$P_{jki}$                      gray level of the $(j,k)$th pixel in the $i$th image
$\mathbf{P}_{jk}$              pixel vector, i.e., an n-dimensional vector whose $i$th element is $P_{jki}$
$P^{d}_{jk}$                   gray level of the $(j,k)$th pixel in the DROI of an image
$\mathbf{P}^{d}_{jk}$          pixel vector in the DROI
$P^{u}_{jk}$                   gray level of the $(j,k)$th pixel in an UROI of an image
$P^{u_i}_{jk}$                 gray level of the $(j,k)$th pixel in the $i$th UROI of an image
$\mathbf{P}^{u_i}_{jk}$        pixel vector in the $i$th UROI
$\sigma$                       standard deviation of white noise
$\mathrm{SNR}_d$               SNR of the desired feature
$S_i$                          MRI signal from the $i$th material
$\mathbf{u}$                   undesired feature signature vector
$\mathbf{u}_i$                 $i$th undesired feature signature vector
$\mathrm{Var}(\cdot)$          variance operator
$\widehat{\mathrm{Var}}(\cdot)$  variance estimator (sample variance)
$V_i$                          partial volume of the $i$th material in a voxel
$V_{ijk}$                      partial volume of the $i$th material in the $(j,k)$th voxel
$V$                            total volume of a voxel
$\mathbf{w}$                   weighting vector for a linear filter
$w_{jk}$                       zero-mean white noise at the $(j,k)$th pixel of an image
$w_{jki}$                      zero-mean white noise at the $(j,k)$th pixel of the $i$th image


II. Introduction

Additive  noise  in  magnetic  resonance  imaging  (MRI)  limits 
correct  identification  and  quantitative  measurements  of  normal 
and  pathological  tissues.  This  is  a  problem  for  human  viewers 
as  well  as  computer  vision  and  automatic  analysis  methods 
such as image segmentation and analysis using a modified
matched filter [1] and the eigenimage filter [2]-[4]. The
eigenimage filter will be briefly described in Section III-E.

In  MRI  clinical  studies,  a  sequence  of  images  of  the  same 
anatomical  site  is  usually  acquired,  which  we  refer  to  as  an 
MRI  scene  sequence.  (This  is  similar  to  multispectral  images 
acquired  in  the  field  of  remote  sensing.)  The  noise  in  an 
MRI scene sequence is characterized by an additive zero-mean
white Gaussian noise field that is uncorrelated between
different frames [5]-[7].

Averaging  several  acquisitions  or  free  induction  decay  sig¬ 
nals,  which  are  used  to  reconstruct  magnetic  resonance  (MR) 
images,  is  the  conventional  method  for  reducing  the  additive 
noise. This method has the following practical difficulties:

1)  It  increases  the  imaging  time  and  cost. 

2)  It  limits  the  patient  throughput. 

3)  It  requires  image  registration  to  compensate  for  the 
patient  movements. 

An alternative approach is restoration of the acquired MR images, which is
performed off-line without having the aforementioned difficulties.
However, conventional image restoration filters found in the image
processing literature, e.g., Wiener, median, and low-pass filters, are not
specifically designed for MRI. As such, they neither use all of the
available information in MRI nor consider the MRI specific requirements,
e.g., preserving partial volume and edge information. The concept of
partial volume and its basis will be explained in Section III-C.

Recent developments in nonlinear filters for computer vision,
e.g., scale space and edge detection using anisotropic
diffusion [8] and adaptive smoothing [9], are indications of
a  need  for  new  nonlinear  methods  to  preserve  edges  while 
suppressing  the  additive  noise.  Both  of  these  filters  use  a  3  x 
3  neighborhood  and  a  nonlinear  function  of  the  signal  gradient 
to  preserve  edges.  They  are  most  appropriate  for  processing  a 
single  image  with  sharp  transitions  between  different  regions. 

In  an  MRI  scene  sequence,  there  are  usually  partial  volume 
pixels  that  generate  a  smooth  transition  between  different 
regions.  In  addition,  there  are  several  images  that  may  be 
processed  simultaneously.  Therefore,  a  new  multidimensional 
nonlinear  edge-preserving  filter  specifically  designed  for  MRI 
seems necessary. We have developed such a filter that uses
both intraframe (spatial) and interframe (parametric or temporal)
information to filter the noise while preserving edge and
partial volume information.

The new filter uses MRI signal models to implement an approximate maximum
likelihood or least squares estimate of each pixel gray level from the
gray levels for the same location in all of the images in sequence; this
corresponds to using interframe information. It also employs a trimmed
mean spatial smoothing algorithm¹ that uses a Euclidean distance
discriminator to preserve partial volume and edge information; this
corresponds to using intraframe information.
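The intraframe step can be illustrated with a minimal sketch of trimmed mean smoothing driven by a Euclidean distance test between pixel vectors. This is an assumption-laden illustration, not the authors' implementation: the function name, the fixed `threshold`, and the border handling are hypothetical, the paper selects its threshold from probabilities of detection and false alarm, and the interframe estimate is not included here.

```python
import numpy as np

def trimmed_mean_filter(scene, half_width=1, threshold=3.0):
    """Smooth a scene of shape (rows, cols, n) with a trimmed neighborhood mean.

    For each pixel, neighbors whose pixel vectors lie farther than
    `threshold` (Euclidean distance) from the center pixel vector are
    ignored; the remaining pixel vectors are averaged.
    """
    rows, cols, n = scene.shape
    out = np.empty((rows, cols, n), dtype=float)
    for j in range(rows):
        for k in range(cols):
            j0, j1 = max(0, j - half_width), min(rows, j + half_width + 1)
            k0, k1 = max(0, k - half_width), min(cols, k + half_width + 1)
            block = scene[j0:j1, k0:k1].reshape(-1, n)
            dist = np.linalg.norm(block - scene[j, k], axis=1)
            keep = dist <= threshold          # step function; the center is always kept
            out[j, k] = block[keep].mean(axis=0)
    return out

# Synthetic two-region scene with a sharp vertical edge (illustrative values).
rng = np.random.default_rng(0)
clean = np.zeros((16, 16, 2))
clean[:, 8:, :] = 100.0
noisy = clean + rng.normal(0.0, 1.0, size=clean.shape)
filtered = trimmed_mean_filter(noisy, half_width=1, threshold=10.0)

print((noisy - clean).std(), (filtered - clean).std())
```

Because pixels across the edge fail the distance test, they never enter the average, so the noise is reduced inside each region without blurring the boundary.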

Trimming data by the new filter is reminiscent of that previously used in
generalized order statistic filters [10]-[15] and in combination with
segmentation algorithms [16]. Nonlinearity and adaptivity of the new
filter are similar to those used for anisotropic diffusion [8] and
adaptive smoothing [9]. However, there are some differences between the
new filter and previous methods. The most important differences are the
following:

1) It uses a Euclidean distance discriminator (i.e., the
optimal solution to a binary detection problem [17]).

2) It puts no restriction on the number of trimmed data
points.

3) It uses a step function as its nonlinearity.

4) It finds the edge of the step function by calculating
probabilities of detection and false alarm.

In Section III, we review the MRI signal model for multiple spin-echo
images and describe interframe and intraframe information, partial volume
averaging, and signal-to-noise ratio (SNR) and contrast-to-noise ratio
(CNR) as defined and estimated in the MRI literature. We also briefly
review the eigenimage filter. In Section IV, we explain the relationship
between the new filter and the generalized order statistic filters and
adaptive smoothing methods. In Section V, we explain details of the new
filter including the implementation of the approximate maximum likelihood
and least squares estimates and calculation of the probabilities of
detection and false alarm. In Section VI, we apply the new filter to
preprocess multiple spin-echo images of a simulation and a brain study
before the eigenimage filtering. We illustrate the effects of
neighborhood size and threshold value on the CNR and appearance of the
resulting eigenimage using simulated and acquired brain images. We then
compare the performance of the new filter to that of several conventional
pre- and post-processing methods, including spatial smoothing, low-pass
filtering with a Gaussian kernel, median filtering, and combined vector
median with average filtering. Finally, we illustrate the preservation of
edge and partial volume information by the new filter using a 1-D signal
extracted from the simulation and volume calculations of the central
region (simulated white matter) in the simulation and the egg white and
egg yolk in an egg phantom. Conclusions and comments are given in
Section VII.

¹ Trimmed mean spatial smoothing refers to the idea of ignoring some data
points in a neighborhood around each pixel as being irrelevant and using
the sample average of the remaining data points as an estimate for the
point in the center of the neighborhood.


B. Inter and Intraframe Information

The model in (2) shows the relationship between corresponding pixels from
different images, which we refer to as interframe (parametric² or
temporal) information. This model provides a means for obtaining a
least-squares (LS) or maximum likelihood estimate (MLE) of the pixel
intensities. The estimation details will be explained in Section V-A.

Intraframe (spatial) information refers to the relationship between
pixels from a particular tissue, which is provided by anatomical
structures visualized in the image. This information exists because each
tissue type has several connected pixels in the image. Our approach to
utilize this information will be explained in Section V.


III.  Background 


A. Multiple Spin-Echo Model

To acquire MR images, a sequence of radio-frequency (RF)
pulses with certain shapes, energy levels, and timing intervals
is applied while the object being imaged is immersed in a
static magnetic field. The resulting average image gray level,
corresponding  to  a  specific  tissue  type,  is  a  function  of  both  the 
intrinsic  tissue  parameters  and  the  parameters  of  the  RF  pulse 
sequence.  One  of  the  most  commonly  used  pulse  sequences 
is  the  multiple  spin-echo  (which  is  also  referred  to  as  CPMG) 
[7],  which  can  generate  several  images  (clinically  utilized  to 
generate  two  or  four  images)  in  a  single  acquisition  setup.  An 
MRI  scene  sequence,  which  is  defined  by  n  multiple  spin- 
echo  images  of  a  slice  with  k  tissue  types  in  the  scene,  can  be 
thought  of  as  a  set  of  n-dimensional  random  vectors,  each 
pertaining  to  one  of  k  uncorrelated  Gaussian  distributions. 
The  MRI  pulse  sequence  and  tissue  parameters  determine  the 
standard  deviation  of  each  distribution  and  the  relationship 
between  their  mean  vectors. 

Here, we review the theoretical model for the image gray levels generated
by a multiple spin-echo sequence with $n$ echoes in terms of tissue and
pulse sequence parameters. The MRI signal $S_i$ (which is the
deterministic portion of the image gray level) arising from a region with
tissue specific parameters $N(H)$ (proton density) and relaxation times
$T1$ and $T2$ in the $i$th image is given by

$$
S_i = N(H)\left[1 + 2\sum_{l=1}^{n} (-1)^{l}\, e^{[(2l-1)TE - 2TR]/(2\,T1)} + e^{-TR/T1}\right] e^{-i\,TE/T2}
\tag{1}
$$


Equation (1) illustrates that for a particular tissue, the signal is a
decaying exponential function of the image number $i$. With the additive
white noise, the intensity of the $(j,k)$th pixel in the $i$th image
$P_{jki}$ can be represented as

$$
P_{jki} = M_{jk}\, e^{-i\,TE/T2} + w_{jki}
\tag{2}
$$

where $M_{jk}$ is a function of $T1$ and $N(H)$ for the tissue at position
$(j,k)$ as well as $TE$ and $TR$ of the pulse sequence given in (1), and
$w_{jki}$ represents white Gaussian noise.
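To make the interframe model in (2) concrete, the sketch below simulates the echo intensities at one pixel and fits the decaying exponential by a log-linear least squares fit. This is only one simple way to obtain an approximate estimate and is an assumption for illustration; the paper's own estimator is described in its Section V-A, which is not reproduced here. The values of `M`, `T2`, `TE`, and the noise level are hypothetical.

```python
import numpy as np

def loglinear_fit(P, TE):
    """Fit ln P_i = ln M - i*TE/T2 by linear least squares.

    P holds the (positive) intensities of one pixel in echoes i = 1..n.
    Returns (M_hat, T2_hat, fitted intensities).
    """
    i = np.arange(1, len(P) + 1)
    slope, intercept = np.polyfit(i * TE, np.log(P), 1)
    M_hat = np.exp(intercept)
    T2_hat = -1.0 / slope
    return M_hat, T2_hat, M_hat * np.exp(-i * TE / T2_hat)

rng = np.random.default_rng(0)
TE, n = 25.0, 4                      # echo spacing in msec, number of echoes
echo = np.arange(1, n + 1)

# Model (2): P_i = M * exp(-i*TE/T2) + w_i, with M = 1000 and T2 = 80 msec.
P_noisy = 1000.0 * np.exp(-echo * TE / 80.0) + rng.normal(0.0, 5.0, size=n)

M_hat, T2_hat, P_fit = loglinear_fit(P_noisy, TE)
print(M_hat, T2_hat)
```

The fitted intensities `P_fit` lie on a single exponential, so they use the relationship between frames to suppress noise at that pixel; the log transform makes this only an approximation to the least squares estimate in the original intensity domain.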


C.  Partial  Volume  Averaging 

In  this  section  and  throughout  the  rest  of  the  paper,  when¬ 
ever  it  is  possible,  we  present  definitions  and  explanations 
using  a  single  image.  This  is  done  for  the  purpose  of  notation 
simplicity.  The  concepts  can  be  easily  extended  to  multiple 
images. 

Here, we review the theoretical model for the MRI signal generated from
voxels containing a mixture of multiple tissues in terms of the signals
for each of the tissues. The MRI signal $S$ from a voxel containing $m$
different materials is given by [19]

$$
S = \sum_{i=1}^{m} \left(\frac{V_i}{V}\right) S_i
\tag{3}
$$

where

$V_i$  volume of the $i$th material within the voxel
$V$    total volume of the voxel
$S_i$  signal from the $i$th material.

The gray level $P_{jk}$ of the $(j,k)$th pixel (corresponding to the
$(j,k)$th voxel) in an MR image is given by

$$
P_{jk} = E[P_{jk}] + w_{jk} = \sum_{i=1}^{m} \left(\frac{V_{ijk}}{V}\right) S_i + w_{jk}
\tag{4}
$$

where $V_{ijk}$ is the partial volume of the $i$th material in the
$(j,k)$th voxel, and $w_{jk}$ represents statistical noise that is again
assumed to be an additive zero-mean white Gaussian noise field with
standard deviation $\sigma$. Note that $E[P_{jk}]$ is deterministic but
unknown, whereas the noise $w_{jk}$ is stochastic; therefore, the pixel
gray level $P_{jk}$ is the sum of a deterministic value (to be estimated)
and noise. We use the notation $E[P_{jk}]$ to denote the deterministic
value of the pixel gray level.

Preserving  partial  volume  averaging  information  means 
that  the  deterministic  portion  of  the  image  gray  levels  is 
maintained,  whereas  the  stochastic  portion  is  suppressed  by 
the  restoration  filter. 
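The partial volume model can be checked with a few lines of arithmetic. The sketch below assumes that the material volumes in a voxel sum to the total voxel volume; the volumes, signals, and noise level are made-up illustrative numbers.

```python
import numpy as np

def voxel_signal(partial_volumes, signals):
    """Deterministic voxel signal: sum over materials of (V_i / V) * S_i.

    Assumes the partial volumes add up to the total voxel volume V.
    """
    partial_volumes = np.asarray(partial_volumes, dtype=float)
    signals = np.asarray(signals, dtype=float)
    V = partial_volumes.sum()
    return float(np.dot(partial_volumes / V, signals))

# A voxel that is 30% material 1 (signal 200) and 70% material 2 (signal 100).
S = voxel_signal([0.3, 0.7], [200.0, 100.0])

# Observed gray level: deterministic part plus zero-mean white Gaussian noise.
rng = np.random.default_rng(0)
sigma = 4.0
P = S + rng.normal(0.0, sigma)
print(S, P)
```

A restoration filter that preserves partial volume information should return a value close to `S` for this voxel rather than pulling it toward either pure-tissue signal.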

D.  Signal-to-Noise  and  Contrast-to-Noise  Ratios 

For medical image interpretation, one is normally interested in clearly
visualizing a specific feature (tissue type), which is referred to as the
desired feature. To assess the image quality, one thus may consider the
SNR of this desired feature ($\mathrm{SNR}_d$), which, in the MRI
literature, is defined as

$$
\mathrm{SNR}_d = \frac{E[P^{d}_{jk}]}{\sqrt{\mathrm{Var}(P^{d}_{jk})}}
\tag{5}
$$

where $E[P^{d}_{jk}]$ is the expected value of the $(j,k)$th pixel gray
level in a region of interest drawn over the desired feature, which is
called the desired ROI (DROI), and $\sqrt{\mathrm{Var}(\cdot)}$ is the
standard deviation of the noise in the DROI. Since statistical noise in
MRI is ergodic and uncorrelated with the signal [5], [6], $E[P^{d}_{jk}]$
and $\mathrm{Var}(P^{d}_{jk})$ in a homogeneous region of $N$ pixels can
be estimated using the sample mean and variance [20]

$$
\hat{E}[P^{d}_{jk}] = \frac{1}{N}\sum_{j,k} P^{d}_{jk}
\tag{6}
$$

$$
\widehat{\mathrm{Var}}(P^{d}_{jk}) = \frac{1}{N-1}\left[\sum_{j,k} \left(P^{d}_{jk}\right)^2 - \frac{1}{N}\Big(\sum_{j,k} P^{d}_{jk}\Big)^2\right]
\tag{7}
$$

The estimated values $\hat{E}[P^{d}_{jk}]$ and
$\widehat{\mathrm{Var}}(P^{d}_{jk})$ are then inserted into (5).

² This is sometimes misleadingly called spectral, which is taken from
remote sensing terminologies.

To derive an analytic expression for the $\mathrm{SNR}_d$ of a composite
image, the following common assumptions [1], [4], [21]-[23] are used:

1) Statistical noise in MRI is modeled as a Gaussian distributed
zero-mean white noise field with standard deviation $\sigma$.

2) Signature vectors are a priori known fairly well.

Then, the standard formula for noise propagation [24] shows that
$\mathrm{SNR}_d$ of a linearly transformed image (e.g., the eigenimage)
with the weighting vector $\mathbf{w}$ is simplified to [4]

$$
\mathrm{SNR}_d = \frac{\mathbf{w}\cdot\mathbf{d}}{\sigma\,(\mathbf{w}\cdot\mathbf{w})^{1/2}}
\tag{8}
$$

An estimate of $\sigma$ is found by (7) using image gray levels in the
DROI or those in a background ROI. When using magnitude reconstructed
images, the standard deviation found from the background ROI needs to be
divided by 0.655 to yield the noise standard deviation [25].
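The factor 0.655 is consistent with the Rayleigh distribution followed by the magnitude of zero-mean complex Gaussian noise, whose standard deviation is sigma * sqrt(2 - pi/2), approximately 0.655 * sigma. The numerical check below assumes independent real and imaginary channels with equal noise level; the simulation setup and variable names are illustrative and not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 10.0                                   # true noise standard deviation per channel

# Real and imaginary noise channels in a signal-free background region.
noise = rng.normal(0.0, sigma, size=(2, 200_000))
background = np.hypot(noise[0], noise[1])      # magnitude reconstructed background

measured_std = background.std(ddof=1)          # what a background ROI would give
rayleigh_factor = np.sqrt(2.0 - np.pi / 2.0)   # theoretical std of Rayleigh noise / sigma
sigma_hat = measured_std / 0.655               # correction described in the text

print(rayleigh_factor, sigma_hat)
```

Without the correction, the background ROI would underestimate the noise level by about a third and inflate any SNR or CNR computed from it.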

The CNR between the desired feature (tissue) and the $i$th interfering
feature (background or another tissue type), which is denoted by
$\mathrm{CNR}_i$, is usually more important than the $\mathrm{SNR}_d$ for
medical image interpretation. It quantifies the distinguishability of the
desired feature from the interfering features. The $\mathrm{CNR}_i$ is
defined as [21]

$$
\mathrm{CNR}_i = \frac{E[P^{d}_{jk}] - E[P^{u_i}_{jk}]}{\sqrt{\dfrac{\mathrm{Var}(P^{d}_{jk}) + \mathrm{Var}(P^{u_i}_{jk})}{2}}}
\tag{9}
$$

where $P^{u_i}_{jk}$ is the gray level of the $(j,k)$th pixel in the
undesired ROI (UROI), and $E[P^{u_i}_{jk}]$ and
$\mathrm{Var}(P^{u_i}_{jk})$ are the mean and the variance of pixel values
in the UROI, respectively.
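The SNR and CNR definitions, with expected values and variances replaced by sample estimates over the ROIs, can be sketched as follows. The ROI data are synthetic, and the use of the unbiased sample variance (`ddof=1`) is an assumption of this sketch.

```python
import numpy as np

def snr_d(droi):
    """SNR of the desired feature: sample mean over sample standard deviation."""
    return droi.mean() / np.sqrt(droi.var(ddof=1))

def cnr_i(droi, uroi):
    """CNR between the desired ROI and one undesired ROI.

    The denominator is the square root of the average of the two
    sample variances.
    """
    pooled_var = (droi.var(ddof=1) + uroi.var(ddof=1)) / 2.0
    return (droi.mean() - uroi.mean()) / np.sqrt(pooled_var)

# Synthetic homogeneous ROIs with the same white Gaussian noise level.
rng = np.random.default_rng(1)
sigma = 4.0
droi = 120.0 + rng.normal(0.0, sigma, size=5000)   # desired feature pixels
uroi = 80.0 + rng.normal(0.0, sigma, size=5000)    # interfering feature pixels

print(snr_d(droi), cnr_i(droi, uroi))
```

With a mean of 120, a mean of 80, and noise level 4, the estimates should come out near 30 for the SNR and near 10 for the CNR.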

E. Eigenimage Filter

The eigenimage filter maximizes the projection of a desired feature while
minimizing the projections of the undesired (interfering) features in a
composite image called an eigenimage (EI) [2]. It has been shown that it
maximizes the $\mathrm{SNR}_d$ while correcting for partial volume
averaging effects [4], [26]. Since, in the eigenimage, the desired feature
appears bright, the interfering features appear dark, and the partial
volumes of the desired feature are visualized, viewing the eigenimage
would be helpful for a better interpretation of the MRI scene. Fig. 3
illustrates the application of the eigenimage filter for segmenting gray
matter from white matter and cerebrospinal fluid (CSF) in brain images.

Mathematically, the eigenimage is a weighted sum of the images in the
sequence. A pixel in the eigenimage is therefore a linear combination of
all pixels at the same location in the MR images, i.e.,

$$
EI_{jk} = \sum_{i=1}^{n} e_i P_{jki} = \mathbf{e}\cdot\mathbf{P}_{jk}
\tag{10}
$$

where $EI_{jk}$ is the gray level of the $(j,k)$th pixel in the
eigenimage, and $\mathbf{e} = [e_1\ e_2\ \cdots\ e_n]^T$ is the weighting
vector to be determined.

To determine the weighting vector, the $\mathrm{SNR}_d$ is maximized

$$\max_{e} \; \mathrm{SNR}_d = \frac{e \cdot d}{\sigma (e \cdot e)^{1/2}} \tag{11}$$

subject to the constraint that

$$e \cdot u_i = 0, \quad \text{for } i = 1, \cdots, m. \tag{12}$$


The  solution  to  the  above  constrained  optimization  problem  is 
given  by  [4],  [26] 

$$e = d - d^{p} \tag{13}$$

where $d^{p}$ is the projection of $d$ onto the subspace spanned
by $\{u_i,\ i = 1, \cdots, m\}$ and can be computed using a
Gram-Schmidt orthogonalization procedure. The solution is
always nonzero unless $d$ is linearly dependent on $\{u_i,\ i = 1, \cdots, m\}$,
which is very unlikely to happen in practice.
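The solution in (13) can be sketched numerically as follows. This is an illustrative implementation, not the authors' code: the function names are hypothetical, a QR factorization stands in for the Gram-Schmidt procedure (it yields an orthonormal basis of the same subspace), and the undesired signature vectors are assumed to be linearly independent.

```python
import numpy as np

def eigenimage_weights(d, undesired):
    """Weighting vector e = d - (projection of d onto span{u_i}).

    d         : desired signature vector, length n
    undesired : array of shape (m, n), one undesired signature vector per row
    Assumption: the rows of `undesired` are linearly independent and m < n.
    """
    d = np.asarray(d, dtype=float)
    u = np.atleast_2d(np.asarray(undesired, dtype=float))
    # Orthonormal basis of span{u_i}; QR plays the role of Gram-Schmidt here.
    q, _ = np.linalg.qr(u.T)
    d_proj = q @ (q.T @ d)
    return d - d_proj

def eigenimage(frames, e):
    """Weighted sum of the n images in `frames` (shape (n, H, W)), as in (10)."""
    return np.tensordot(e, np.asarray(frames, dtype=float), axes=1)
```

By construction the returned vector is orthogonal to every undesired signature vector, so regions matching those signatures map to zero mean in the eigenimage.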

IV. Relation with Similar Filters

Before presenting the new image restoration filter in Section
V, we explain the relationship of this new filter to the generalized
order statistic filters and to the anisotropic diffusion
and adaptive smoothing filters.

A. Generalized Order Statistic Filters [10]-[15]

For  this  class  of  filters,  the  data  points  in  a  neighborhood 
from  a  single  image  are  ordered.  A  fixed  percentage  of  the 
upper  and  lower  ends  of  the  ordered  data  is  ignored  as 
outliers,  and  the  sample  average  of  the  remaining  data  points 
is used as an estimate for the point in the center of the
neighborhood.* The new filter shares the idea of trimming data
with this class of filters. It, however, considers a set of images
simultaneously, rather than a single image. It neither orders the
data points nor fixes the percentage of the data to be ignored.
Instead, it uses the Euclidean distance as a similarity measure
and discards any pixel vector whose Euclidean distance is
larger than a specific threshold value (this is adopted from
the optimal solution to a binary detection problem [17]). The
threshold is selected based on the probabilities of detection and
false alarm, as explained in Sections V-B and C. A detailed
analysis of the threshold selection is a unique feature of this
paper.

Fig. 1. Graphic illustration of an MRI scene sequence and the use of a neighborhood to access both intraframe (spatial) and interframe (parametric or temporal) information.
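For comparison, the trimming idea used by generalized order statistic filters can be sketched in a few lines. This is a generic trimmed-mean illustration rather than any specific filter from [10]-[15]; the function name and the convention that `trim_fraction` is the fraction removed from each end are assumptions.

```python
import numpy as np

def trimmed_mean(values, trim_fraction):
    """Average the ordered samples after dropping a fixed fraction at each end.

    Assumption: `trim_fraction` (0 to 0.5) is removed from both the lower and
    the upper end; at 0.5 only the middle sample(s) remain, i.e. the median.
    """
    v = np.sort(np.asarray(values, dtype=float).ravel())
    k = int(trim_fraction * v.size)      # samples dropped at each end
    k = min(k, (v.size - 1) // 2)        # always keep at least the middle
    return v[k:v.size - k].mean()
```

Unlike this scheme, the filter proposed here does not sort the samples and does not fix the discarded fraction in advance.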

B. Anisotropic Diffusion and Adaptive Smoothing [8], [9]

These  filters  consider  a  single  image  and  a  3  x  3  neigh¬ 
borhood.  They  use  a  nonlinear  function  of  the  magnitude  of 
the  gradient  vector  at  each  point  in  the  neighborhood  as  the 
weighting  factor  for  smoothing.  The  new  filter  shares  the  idea 
of  adapting  the  weighting  factor  to  the  characteristics  of  the 
signal  at  each  point  of  the  neighborhood  with  these  filters.  It, 
however,  considers  a  set  of  images  simultaneously,  and  uses  a 
simple  nonlinearity  (a  step  function)  along  with  the  Euclidean 
distance  as  a  measure  of  similarity.  It  should  be  noted  that  there 
is  little  advantage  in  using  a  smooth  function  in  determining 
the  weighting  factors  for  nonlinear  filtering,  as  opposed  to  that 
for  linear  filtering  that  avoids  frequency  domain  sidelobes.  The 
main  advantage  of  nonlinear  filtering  is  that  it  discriminates 
between  high-frequency  components  of  signal  and  noise.  For 
nonlinear  filtering,  the  choice  of  a  step  function  has  two 
advantages:  It  lessens  the  computational  load  and  makes  it 
simple  to  optimize  the  nonlinearity  (i.e.,  its  threshold)  based 
on the probabilities of detection and false alarm. Using the
Euclidean distance discriminator has a major advantage of
making it possible to use either a small neighborhood, e.g.,
3 x 3, or a larger one, e.g., 9 x 9. This is impossible with
the gradient vector since it makes no distinction between flat
(homogeneous) regions of two different objects that may be
present in a large neighborhood.

* 50% trimming yields the median filter.


V.  Proposed  New  Image  Restoration  Filter 

The  new  filter  uses  both  intra-frame  (spatial)  and  inter-frame 
(parametric  or  temporal)  information  to  suppress  the  additive 
noise  in  MR  images,  while  preserving  and  enhancing  edge  and 
partial  volume  information.  Details  of  the  proposed  filter  are 
as  follows. 

First, we consider a neighborhood centered on the pixel to be
estimated (see Fig. 1); the size and shape of this neighborhood
depend on the size and shape of the objects in the scene as
well as the allowed computation time. Since there is usually
no a priori knowledge regarding the shapes of the objects in
the scene, we always use a square neighborhood. The size
of this square neighborhood is mainly limited by the allowed
computation time. We normally use a 9 x 9 neighborhood, for
which the computation time for 256 x 256 images is about 4
min on a Sun SPARCstation 2. Use of a neighborhood larger
than 9 x 9 (81 pixels) would usually improve performance, but
the amount of computation required was judged to make such
choices infeasible for the computational power presently available
in typical image analysis laboratories.

Second, we implement the optimal solution to a binary
detection problem [17]. That is, we calculate the Euclidean
distance between each pixel vector in the neighborhood and
the pixel vector in the center, which is the pixel vector to be
estimated. If this distance is smaller than a specific threshold
value $\eta$, we consider that pixel vector in the estimate of the
pixel vector in the center; if this distance is greater than $\eta$,
that pixel vector is not used in the estimate. The threshold $\eta$
depends on the noise standard deviation $\sigma$ in the images, the
contrast between adjacent ROI's, and partial volume averaging
effects that are reflected in the smoothness of the edges in each
image. In practice, $\eta$ is chosen based on the probabilities of
detection and false alarm (as explained in Sections V-B and
C), which depend on all of these. There is a tradeoff between
suppression of statistical noise, preservation of the average
partial volume information, and enhancement and sharpening
of image edges, which is manifested as the tradeoff between
the probabilities of detection and false alarm.

Third, we consider two methods of computing an estimate
for the contributing pixel vectors: 1) just the sample averaging
of the contributing pixel vectors and 2) a maximum likelihood
estimation (as explained in Section V-A) in addition to the
sample averaging. The resulting filters are called AVG and
MLE + AVG, respectively. The MLE + AVG version is
expected to perform better than the AVG version
since it uses interframe as well as intraframe information.
However, it is not applicable if the MRI signal model is
inaccurate due to the specific choice of the imaging and pulse
sequence protocols and parameters.

Fig. 2. Flowchart of the MLE + AVG and AVG restoration filters for an image size of N x N and a neighborhood size of NS x NS.

Fourth,  we  move  to  an  adjacent  pixel,  i.e.,  shift  the  neigh¬ 
borhood,  and  repeat  the  procedure.  Finally,  we  calculate  the 
average  of  several  estimates  obtained  for  a  particular  pixel 
vector  to  find  the  filter  output  for  that  pixel  vector.  A  flowchart 
explaining  both  versions  of  the  filter  is  given  in  Fig.  2. 

Note that the last averaging step uses appropriate pixels
(with Euclidean distances less than $\eta$) outside a neighborhood
but inside a shifted neighborhood (with shifts of at most half
of the neighborhood size in each direction) in estimating the
pixel vector in the center of the first neighborhood. This
allows contributions from the appropriate pixels in a larger
neighborhood encompassing four times as many pixels, with
the addition of a small computational load. Considering this
effective neighborhood size of about $4NS^2$ pixels, the upper bound
on the SNR improvement factor for the AVG version of the new filter is
$\sqrt{4NS^2} = 2NS$, using an NS x NS neighborhood.
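The four steps above can be sketched for the AVG version as follows. This is an illustrative reading of the procedure, not the authors' implementation. It assumes that the estimate computed in each neighborhood is credited to every contributing pixel vector and that each pixel's output is the average of the estimates it received. The function name `avg_filter`, the truncation of neighborhoods at the image border, and the requirement `eta > 0` are further assumptions.

```python
import numpy as np

def avg_filter(frames, eta, ns=9):
    """AVG-style restoration of an image sequence of shape (n, H, W).

    eta : threshold on the Euclidean distance between pixel vectors (eta > 0)
    ns  : side of the square neighborhood (odd)
    """
    frames = np.asarray(frames, dtype=float)
    _, height, width = frames.shape
    r = ns // 2
    acc = np.zeros_like(frames)        # sum of estimates received by each pixel
    cnt = np.zeros((height, width))    # number of estimates received
    for j in range(height):
        for k in range(width):
            # Neighborhood centered on (j, k), truncated at the image border.
            j0, j1 = max(0, j - r), min(height, j + r + 1)
            k0, k1 = max(0, k - r), min(width, k + r + 1)
            block = frames[:, j0:j1, k0:k1]
            center = frames[:, j, k][:, None, None]
            dist = np.sqrt(((block - center) ** 2).sum(axis=0))
            keep = dist < eta          # step nonlinearity on the distance
            estimate = block[:, keep].mean(axis=1)
            # Credit the estimate to every contributing pixel vector.
            acc[:, j0:j1, k0:k1][:, keep] += estimate[:, None]
            cnt[j0:j1, k0:k1][keep] += 1
    return acc / cnt
```

On a noiseless step edge whose height exceeds `eta`, no pixel from the other side of the edge is ever averaged in, so the edge is returned unchanged. In a flat noisy region most neighbors pass the test and the noise is averaged down.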


A.  Approximate  Maximum  Likelihood  and 
Least  Squares  Estimates 


Maximum likelihood and least-squares are two well-known
estimation criteria. For the Gaussian pixel model in (2), the
maximum likelihood estimate coincides with the least-squares
estimate [27]. Computation of the optimal estimator requires
solving a nonlinear system of equations, which is computationally
intense. We therefore consider the following suboptimal
estimator of the gray levels. We perform an approximation
by factoring the signal $M_{jk}e^{-iTE/T2_{jk}}$ out of (2) and then
taking its natural logarithm, yielding

$$\begin{aligned}
\ln(P_{jki}) &= \ln\left(M_{jk}e^{-iTE/T2_{jk}}\right) + \ln\left(1 + \frac{w_{jki}}{M_{jk}e^{-iTE/T2_{jk}}}\right) \\
&\approx \ln(M_{jk}) - \frac{iTE}{T2_{jk}} + ws_{jki} \\
&= a_{jk} + b_{jk}\,i + ws_{jki}.
\end{aligned} \tag{14}$$

In simplifying (14), we used the Taylor series expansion

$$\ln(1+x) = \sum_{k=1}^{\infty} \frac{(-1)^{k+1}x^{k}}{k} \tag{15}$$

and neglected $x^2$ and higher order terms. This is reasonable
since usually $M_{jk}e^{-iTE/T2_{jk}}/w_{jki} \gg 1$, and thus
$x = w_{jki}/(M_{jk}e^{-iTE/T2_{jk}}) \ll 1$. Note that $ws_{jki}$ is also zero-mean and
Gaussian distributed since it is a scaled version of $w_{jki}$. Therefore,
the maximum likelihood estimate of $\ln(P_{jki})$ is identical
to a weighted least squares estimate for the line $a_{jk} + b_{jk}\,i$.
Weights for the least squares estimate are $M_{jk}e^{-iTE/T2_{jk}}$,
which we approximate by $P_{jki}$ (see (2)).
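A minimal sketch of this log-linear fit for a single pixel vector is given below. It assumes the mono-exponential multiple spin-echo model implied by (14), with echo index i = 1, ..., n, $a_{jk} = \ln(M_{jk})$, and $b_{jk} = -TE/T2_{jk}$. The function name `mle_fit` and the choice to apply the weights $P_{jki}$ to the squared residuals are assumptions; the text does not say whether the weights multiply the residuals or their squares.

```python
import numpy as np

def mle_fit(pixel_vector, te):
    """Weighted least-squares line fit to ln(P_i) = a + b*i for one pixel vector.

    pixel_vector : positive gray levels P_1..P_n of one pixel across the echoes
    te           : echo spacing TE (same units as the returned T2)
    Assumption: weights w_i = P_i are applied to the squared residuals.
    Returns (fitted gray levels, M estimate, T2 estimate).
    """
    p = np.asarray(pixel_vector, dtype=float)
    i = np.arange(1, p.size + 1, dtype=float)
    y = np.log(p)
    sw = np.sqrt(p)                                  # square roots of the weights
    design = np.column_stack([np.ones_like(i), i])
    coef, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
    a, b = coef
    return np.exp(a + b * i), np.exp(a), -te / b
```

For a noiseless decay the fit recovers the signal exactly; with noise it returns a smoothed pixel vector constrained to the exponential model, which is what the MLE step contributes before spatial averaging.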


B. Probability of Detection

In the selection of the threshold $\eta$, we consider the probabilities
of detection $P_D$ and false alarm $P_F$.† The probability
of detection $P_D$ is the chance of correctly identifying a pixel
vector $P_{jk}$ in the neighborhood that represents the same tissue
type as the pixel vector in the center. These pixel vectors
are assumed to be uncorrelated and Gaussian distributed, with
mean vector $d$ and covariance matrix $\sigma^2 I$. The difference
vector $D_{jk} = P_{lm} - P_{jk}$ is therefore Gaussian distributed
with mean vector $0$ and covariance matrix $2\sigma^2 I$. The square of
the Euclidean distance $ED_{jk}^2 = \|D_{jk}\|^2$ between these pixel
vectors has a scaled chi-squared distribution with n degrees
of freedom, mean of $2n\sigma^2$, and variance of $8n\sigma^4$ [28]. Using
the threshold value $\eta$ for the Euclidean distance $ED_{jk}$ yields
the following probability of detection $P_D$:

$$P_D = \int_0^{\eta^2/2\sigma^2} f_X(x)\,dx = \int_0^{\eta^2/2\sigma^2} \frac{1}{2^{n/2}\,\Gamma(n/2)}\, x^{n/2-1} e^{-x/2}\,dx \tag{16}$$

where $f_X$ is the chi-squared pdf, with n degrees of freedom, of $X = ED_{jk}^2/2\sigma^2$.


TABLE I
Threshold Values (η) and Probabilities of Detection (P_D) and False
Alarm (P_F) for Simulation (Brain), Using White Matter and
Gray Matter Signature Vectors, and Several Percentages of
Partial Volumes: a = 0%; b = 25%; c = 50% (σ = 8.75)

  η         P_D      P_F(a)    P_F(b)    P_F(c)
  1.459σ    0.100    0.0000    0.0000    0.0035
  1.816σ    0.200    0.0000    0.0002    0.0086
  2.095σ    0.300    0.0000    0.0004    0.0167
  2.347σ    0.400    0.0000    0.0009    0.0275
  2.590σ    0.500    0.0000    0.0017    0.0440
  2.844σ    0.600    0.0000    0.0031    0.0675
  3.124σ    0.700    0.0001    0.0053    0.1028
  3.460σ    0.800    0.0001    0.0103    0.1586
  3.674σ    0.850    0.0002    0.0157    0.2038
  3.944σ    0.900    0.0006    0.0259    0.2675
  4.356σ    0.950    0.0017    0.0516    0.3831
  4.721σ    0.975    0.0035    0.0876    0.4925
  5.153σ    0.990    0.0082    0.1503    0.6212
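The relationship between the threshold and the probability of detection in Table I can be checked with a short calculation. The sketch below assumes an even number of images n, for which the chi-squared integral in (16) has the closed form $P_D = 1 - e^{-t}\sum_{j=0}^{n/2-1} t^j/j!$ with $t = \eta^2/4\sigma^2$. The function names and the bisection search are illustrative choices, not part of the original paper.

```python
import math

def p_detect(eta_over_sigma, n):
    """P_D of Eq. (16) for a threshold given as a multiple of sigma.

    Assumption: n (the number of images) is even, so the chi-squared CDF
    with n degrees of freedom has a finite closed form.
    """
    if n % 2:
        raise ValueError("this sketch handles an even number of images only")
    t = eta_over_sigma ** 2 / 4.0       # half of eta^2 / (2 sigma^2)
    term, total = 1.0, 0.0
    for j in range(n // 2):
        total += term
        term *= t / (j + 1)
    return 1.0 - math.exp(-t) * total

def threshold_for(pd, n, lo=0.0, hi=50.0):
    """Threshold (in units of sigma) that yields a requested P_D, by bisection."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if p_detect(mid, n) < pd:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With n = 4, `threshold_for(0.9, 4)` gives approximately 3.944, and `threshold_for(0.5, 4)` gives approximately 2.59, in agreement with the corresponding rows of Table I.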

Equation (16) illustrates that $P_D$ increases with
the ratio of $\eta$ to $\sigma$: increasing $\eta$ or decreasing $\sigma$ increases $P_D$.
The value of $\sigma$ is dictated by the MRI instrumentation and
acquisition parameters that are fixed from this paper's point
of view. Thus, we investigate the relationship between $P_D$ and
$\eta$. To make the results independent of the numerical value of
$\sigma$, we find $P_D$ for $\eta$ being equal to a multiple of $\sigma$. Table I
lists the corresponding threshold values for several $P_D$'s when
using a sequence of four multiple spin-echo images.

† It is ideal to have a $P_D$ equal to 1.0 and a $P_F$ equal to 0.0.

C. Probability of False Alarm

The probability of false alarm $P_F$ is the chance of wrongly
classifying a pixel vector, which represents a different
tissue, as one that represents the same tissue as that of the pixel
vector in the center of the neighborhood. As before, these
pixel vectors are assumed to be uncorrelated and Gaussian
distributed with identical covariance matrices $\sigma^2 I$ but different
mean vectors $u$ and $d$, respectively. The difference vector
$D_{jk} = P_{lm} - P_{jk}$ is then Gaussian distributed with mean
vector $m = d - u$ and covariance matrix $2\sigma^2 I$. A scaled
version of the square of the Euclidean distance between
the pixel vectors, i.e., $ED_{jk}^2/2\sigma^2 = \|D_{jk}\|^2/2\sigma^2$, has a
noncentral chi-squared distribution. Using standard techniques
for deriving probability density functions (pdfs) [20], the pdf
for the square of the ith component of the difference vector
divided by $\sqrt{2}\sigma$ is obtained as‡

$$f_{Y_i}(y) = \frac{1}{2\sqrt{2\pi y}}\left[e^{-(\sqrt{y}-m_i)^2/2} + e^{-(\sqrt{y}+m_i)^2/2}\right], \quad y > 0, \quad 1 \le i \le n, \tag{17}$$

where $m_i$ is the ith component of the difference vector $m$
divided by $\sqrt{2}\sigma$. The pdf $f_Y(y)$ for the square of the Euclidean
distance $ED_{jk}$ divided by $\sqrt{2}\sigma$, i.e., $ED_{jk}^2/2\sigma^2$, is found by
convolving the n pdfs given in (17), i.e.,

$$f_Y(\cdot) = f_{Y_1}(\cdot) * f_{Y_2}(\cdot) * \cdots * f_{Y_n}(\cdot). \tag{18}$$

Finally, the probability of false alarm $P_F$ is

$$P_F = \int_0^{\eta^2/2\sigma^2} f_Y(y)\,dy. \tag{19}$$

Equations (17) to (19) show that $P_F$ is related to
$ED_{jk}/\sigma$ in addition to $\eta/\sigma$. That is, both the CNR's of
the MR images and the threshold value determine $P_F$. The
CNR's are dictated by the MRI instrumentation and acquisition
parameters as well as the tissues in the scene, both of which are
fixed from this paper's point of view; we focus on the multiple
spin-echo images of the brain. We therefore investigate the
relationship between $P_F$ and $\eta$. Again, to make the results
independent of the numerical value of $\sigma$, we find $P_F$ for $\eta$
being equal to a multiple of $\sigma$.
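Rather than convolving the n densities of (17) numerically, $P_F$ can also be evaluated from the standard series form of the noncentral chi-squared distribution, which is a Poisson-weighted mixture of central chi-squared CDFs. The sketch below takes that route. It is a swapped-in technique, not the derivation used in the paper, and it assumes an even number of images and a moderate noncentrality so that a fixed number of series terms suffices. The function names and the `terms` parameter are hypothetical.

```python
import math

def chi2_cdf_even(x, dof):
    """CDF of a central chi-squared variable with an even number of dof."""
    t = x / 2.0
    term, total = 1.0, 0.0
    for j in range(dof // 2):
        total += term
        term *= t / (j + 1)
    return 1.0 - math.exp(-t) * total

def p_false_alarm(eta, sigma, mean_diff, terms=200):
    """P_F of Eq. (19) via the Poisson-mixture series of the noncentral chi-squared CDF.

    mean_diff : components of m = d - u (difference of the two mean vectors)
    Assumptions: len(mean_diff) is even; the noncentrality is moderate, so
    `terms` series terms are enough and exp(-lam/2) does not underflow.
    """
    n = len(mean_diff)
    if n % 2:
        raise ValueError("this sketch handles an even number of images only")
    x = eta ** 2 / (2.0 * sigma ** 2)                 # upper limit of (19)
    lam = sum(m * m for m in mean_diff) / (2.0 * sigma ** 2)
    weight = math.exp(-lam / 2.0)                     # Poisson weight for j = 0
    total = 0.0
    for j in range(terms):
        total += weight * chi2_cdf_even(x, n + 2 * j)
        weight *= (lam / 2.0) / (j + 1)
    return total
```

When the two mean vectors coincide the result reduces to $P_D$, and it falls as the separation between the mean vectors grows, matching the trend across columns (c), (b), and (a) of Table I.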

Table I shows the corresponding $P_F$'s for several threshold
values using the human brain images, considering different
percentages of partial volumes between white matter and
gray matter. For (a), it is assumed that pixel vectors in a
neighborhood correspond to voxels containing either pure
white matter or pure gray matter. For (b), it is assumed
that pixel vectors in a neighborhood correspond to voxels
containing either pure white matter or a combination of 25%
white matter and 75% gray matter. Similarly for (c), pixel
vectors corresponding to voxels containing either pure white
matter or a combination of 50% white matter and 50% gray
matter are considered. For each case, the Euclidean distance
between the mean vectors from the two categories is calculated
and used in the calculation of the corresponding $P_F$. We
do not need to consider the partial volumes between white
matter and CSF, or gray matter and CSF, since for these cases
the $P_F$'s are smaller than those in Table I. This is because
the Euclidean distances between the corresponding signature
vectors are larger.

‡ Division by $\sqrt{2}\sigma$ simplifies the derivation and results in a standard known
distribution, i.e., noncentral chi-squared.

Fig. 3. Illustration of the eigenimage filtering using four multiple spin-echo T2-weighted images (TE/TR = 25-100/2500 ms) and inversion recovery
image (TE/TI/TR = 20/600/1500 ms) of a normal human brain. Original images with the desired and undesired ROI's are shown to the left, and the
gray matter eigenimage is shown to the right. Signature and weighting vectors are graphed at the bottom. For the mathematical basis of the eigenimage
filtering, see Section III-E.

VI.  Application  and  Evaluation 

We have applied the new restoration filter as a preprocessing
step for eigenimage filtering [2]-[4], [26]. Fig. 3 is a schematic
representation of eigenimage filtering. Four multiple spin-echo
images and one inversion recovery image are shown to the left. The
regions of interest for white matter, gray matter, and CSF are
also shown. The eigenimage generated by taking gray matter as
the desired feature, and white matter and CSF as the interfering
features, is shown to the right. Here, the quality (CNR) of the
eigenimage is good since, in addition to four multiple spin-echo
images, an inversion recovery image is also used. The white
matter eigenimage using images 1-5 in Fig. 3 is shown in
Fig. 5(i). However, using only four multiple spin-echo images,
the eigenimage for white matter (shown in Fig. 5(e)) is so
noisy that no structure can be seen in it. In this case, use
of the new restoration filter as a preprocessing step before
the eigenimage filtering has generated a highly improved
eigenimage, which is shown in Fig. 5(h). This eigenimage
shows the white matter structure clearly. It has segmented
the desired feature (white matter) from the interfering features
(gray matter and CSF) with a high CNR.

A.  Examples 

For experimental evaluation of the new filter as a preprocessing
filter and comparing it with several conventional pre or
postprocessing filters, we have used MRI scene sequences of a
computer simulation and a human brain. Each image sequence
consists of four multiple spin-echo images with TE/TR of 25,
50, 75, and 100/2500 ms. The simulated scene consists of
several elliptic regions with different sizes (ellipses on the
top simulate white matter, and those on the left simulate gray
matter) and a square region (the central square simulates white
matter, and L-shaped regions on the sides simulate CSF and
gray matter) to illustrate effects of each filter on objects with
different shapes and sizes. Partial volume regions between
white matter and gray matter, and between white matter and
CSF, are located on top and on the right-hand sides of the central
square and on the left and bottom sides of it, respectively.
Original images for the simulation and brain are shown in
Figs. 4(a)-(d) and 5(a)-(d), respectively. Eigenimages without
any pre or postprocessing are shown in Figs. 4(e) and 5(e),
respectively. Eigenimages generated after the application of
the MLE + AVG filter as a preprocessing step are shown in
Figs. 4(h) and 5(h). Note the great improvement in the quality
(CNR) of these eigenimages.

Fig. 4. (a)-(d) Four simulated spin-echo T2-weighted (TE/TR = 25-100/2500 ms) MR images. The scene consists of 12 ellipses, with different sizes and
orientations, and a square with three overlapping regions; ellipses at the top and central region of the square represent white matter, ellipses on the left
and the top and right strips of the square represent gray matter, and the left and bottom strips of the square represent CSF. Voxels corresponding to the
pixels in the overlapping regions are assumed to contain convex combinations of the overlapping tissues. Images are windowed together to visualize the
exponential decaying behavior of the MRI signal in each ROI; (e)-(h) eigenimages generated for the central region using the images shown in the first row
and their preprocessed versions by MLE, AVG, and MLE + AVG filters using a threshold value of 4σ (σ = 8.75) and a neighborhood size of 9 x 9,
respectively; (i) eigenimage generated using noiseless original images; (j)-(k) eigenimages generated using median and median-AVG preprocessed images;
(l)-(p) postprocessed eigenimages by low-pass (smooth 3-1 and G-9, as explained in Section VI-C), AVG, median, and median-AVG filters, respectively.


B. Effects of Neighborhood Size and Threshold Value

To illustrate the effects of neighborhood size and threshold
value, we have generated several eigenimages for each
example. These eigenimages are shown in Figs. 6 and 7.
Neighborhood sizes for images in the first, second, third, and
fourth rows are 3 x 3, 5 x 5, 7 x 7, and 9 x 9, respectively.
Threshold values for the first, second, third, and fourth
columns are 1σ (σ = 8.75), 2σ, 4σ, and 8σ, respectively.
Table II lists CNR1 (CNR between the desired feature and
first interfering feature) and CNR2 (CNR between the desired
feature and second interfering feature) for each example, using
several neighborhood sizes and threshold values.

It is seen that by increasing the threshold value from 1σ
to 4σ, the CNR's of the eigenimage rapidly increase, but
further increments of the threshold value do not considerably
increase these CNR's. This is because increasing the threshold
increases the probability of detection, which reaches 0.90
at 4σ (95% of possible improvement is achieved at this
threshold level since CNR is proportional to the square root
of the number of pixels averaged, and $\sqrt{0.90} \approx 0.95$). Further increments of
the threshold value may slightly improve the CNR's at the
cost of increasing the probability of false alarm (inclusion of
partial volume pixels that smooth edges and blur the whole
image, as visualized by eigenimages in the third and fourth
columns of Figs. 6 and 7). It is also observed that increasing
the neighborhood size improves the CNR. This is because
increasing the neighborhood size increases the number of
pixels in each averaging step. This improvement is achieved
at the price of increasing the computation time. Using a Sun
SPARCstation 2, the computation times for 3 x 3, 5 x 5, 7 x 7,
and 9 x 9 neighborhoods are approximately 0.4, 1.2, 2.5, and
4.0 min, respectively. Since the filter is highly parallelizable,
use of a parallel processing computer will significantly reduce
the computation time. A Kalman filtering implementation of
the MLE will also reduce the computation time.

Fig. 5. (a)-(d) Four spin-echo T2-weighted (TE/TR = 25-100/2500 ms) MR images of a human brain. Images are windowed together to visualize the
exponential decaying behavior of the MRI signal in each ROI; (e)-(h) eigenimages generated for the white matter using the images shown in the first
row and their preprocessed versions by MLE, AVG, and MLE + AVG filters using a threshold value of 4σ (σ = 8.75) and a neighborhood size of 9 x
9, respectively; (i) eigenimage generated using the multiple spin-echo images in (a)-(d) plus an inversion recovery image; (j)-(k) eigenimages generated
using median and median-AVG preprocessed images; (l)-(p) postprocessed eigenimages by low-pass (3-1 and G-9, as explained in Section VI-C), AVG,
median, and median-AVG filters, respectively.

Details  of  these  implementations  are  beyond  the  scope  of 
this  paper.  However,  since  the  Kalman  filtering  has  been 
conventionally  applied  to  image  restoration  in  a  totally  dif¬ 
ferent  context,  a  brief  explanation  of  our  intended  use  seems 
beneficial.  For  the  proposed  restoration  filter,  the  MLE  finds 
the minimum mean square fit to the 1-D signals (pixel vectors).
An  advantage  of  the  Kalman  filtering  implementation  of  the 
minimum  mean  square  solution  is  that  it  is  faster  than  the 
original  implementation  [17].  A  simple  explanation  for  its 
speed  is  that  at  each  point,  it  uses  only  the  previous  data  points 
instead  of  using  all  of  the  data  points.  This  increases  the  speed 
in  a  tradeoff  with  performance. 

Images in Figs. 6-7, CNR's in Table II, and $P_D$'s and $P_F$'s
in Table I illustrate that for a multiple spin-echo sequence with
four echoes, a threshold value of 4σ and a neighborhood size
of 9 x 9 are good choices. For this threshold level, about 90%
of the pure pixels from the same tissue and only about 10%
(on average) of those with less than 50% partial volume of
the same tissue are used in each averaging step of the filter.
These parameters were used to generate the results shown in
Figs. 4(g)-(h), 5(g)-(h), and 8(c)-(d).

C.  Comparison  with  Conventional  Pre 
or  Postprocessing  Filters 

Several other pre or postprocessing filters (some nonlinear
and some linear) are also applied to the above examples.
Nonlinear filters applied are vector median and vector median-average
filtering [29] of the original images and median and
median-average filtering of the eigenimages with a 9 x 9
neighborhood. Linear filters applied are low-pass filtering (spatial
smoothing) of the eigenimages with an impulse response of zero
everywhere except for a square of 3 x 3 centered at the origin, where
it is one (smooth 3-1), or except for a square of 9 x 9,
where it is proportional to a 2-D zero-mean Gaussian pdf
with a variance of 4 (smooth G-9). The
variance of the smooth G-9 filter was selected to implement
an example of low-pass filtering with no significant frequency
domain side lobes. The Gaussian kernel decreases from a
maximum value of one at the center of the neighborhood to
0.13 at the edge centers and 0.02 at the corners of the 9 x 9
neighborhood. Since the eigenimage filter is linear, preprocessing
of the original images by a linear filter is equivalent to
postprocessing of the eigenimage by the same filter, except that
preprocessing is computationally more intense. We therefore
consider postprocessing of the eigenimages by linear filters.
The resulting eigenimages, using the above filters, are shown
in Figs. 4(j)-(p) and 5(j)-(p).

Fig. 6. Eigenimages generated for the central region of the simulation (which simulates white matter) after preprocessing (MLE + AVG) of the original
images with several neighborhood sizes and threshold values. Neighborhood sizes for images in the first, second, third, and fourth rows are 3 x 3, 5 x 5, 7
x 7, and 9 x 9, respectively. Threshold values for the first, second, third, and fourth columns are 1σ (σ = 8.75), 2σ, 4σ, and 8σ, respectively.
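Two statements about the linear filters can be checked numerically: the quoted values of the smooth G-9 kernel, and the equivalence of linear preprocessing and postprocessing around the (linear) eigenimage filter. The sketch below is illustrative only. The function names are hypothetical, the weighting vector is arbitrary, and the 3 x 3 unit-weight smoother uses periodic borders for brevity, which is an assumption rather than the boundary handling used in the paper.

```python
import numpy as np

def gaussian_kernel(size=9, variance=4.0):
    """2-D zero-mean Gaussian kernel, scaled to one at the center (smooth G-9 style)."""
    r = size // 2
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    return np.exp(-(x ** 2 + y ** 2) / (2.0 * variance))

def box3(img):
    """3 x 3 unit-weight smoothing with periodic borders (smooth 3-1 style)."""
    return sum(np.roll(np.roll(img, dj, axis=0), dk, axis=1)
               for dj in (-1, 0, 1) for dk in (-1, 0, 1))

kernel = gaussian_kernel()
edge_center, corner = kernel[4, 0], kernel[0, 0]   # exp(-2) and exp(-4)

# Linear smoothing commutes with the weighted sum that forms the eigenimage.
rng = np.random.default_rng(1)
frames = rng.normal(size=(4, 8, 8))
e = np.array([0.5, -1.0, 2.0, 0.25])               # arbitrary weighting vector
pre = np.tensordot(e, np.stack([box3(f) for f in frames]), axes=1)
post = box3(np.tensordot(e, frames, axes=1))
```

Here `edge_center` and `corner` evaluate to about 0.135 and 0.018, consistent with the quoted 0.13 and 0.02, and `pre` equals `post` to numerical precision. Postprocessing filters one image instead of n, which is why it is the cheaper of the two equivalent options.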

The MLE + AVG filter with a neighborhood size of 1 x
1 (MLE, since no averaging is performed using this neighborhood)
as well as without performing MLE (AVG) are also
applied to each example. These runs show the effects of
ignoring intraframe (spatial) or part of interframe (parametric
or temporal) information, respectively. The corresponding
eigenimages are shown in Figs. 4(f)-(g) and 5(f)-(g). The
CNR's of the original images, restored images (MLE + AVG),
and eigenimages discussed above are listed in Table III.

From eigenimages in Figs. 4 and 5 and their CNR's in
Table III, it is seen that the MLE + AVG filter, which uses
both interframe and intraframe information, outperforms all
of the other methods. For the above examples, it generated an
improvement in the CNR by a factor between 15 and 60 while
preserving and enhancing edges. Moreover, as illustrated by
brain eigenimages, other filters drop out inner regions of white
matter with partial volume averaging effects. (This deficiency
is not quantified by CNR since it is calculated using pure
regions.) Impacts of each pre or postprocessing filter on the
edge and outer regions with partial volume averaging effects
are illustrated in the simulation example, for which the truth
is known (the noiseless eigenimage is shown in Fig. 4(i)). It
is clearly seen that the new restoration filter has preserved all
of the edges in each ellipse, regardless of its size.

D. Effects of New Filter on Edge and
Partial Volume Information

For further illustration of preserving edge and partial volume
information by the AVG version of the new restoration filter
while suppressing the additive noise (improving the SNR and
CNR), we have performed the following experiments.

The first experiment uses a 1-D signal extracted from the
simulation study by considering all pixels on the 170th row
of the third simulated image shown in Fig. 4(c). The extracted
signals from the noiseless simulation, from the simulation with
an additive zero-mean white Gaussian noise, from the restored
image using MLE + AVG filter, and from the restored image
using AVG filter are shown in Fig. 8(a)-(d), respectively. They
graphically illustrate that the additive noise is suppressed,
edges are preserved, and partial volume information (slopes)
is restored.

The  second  experiment  uses  an  MRI  scene  sequence  (four 
multiple  spin-echo  T2-weighted  images  with  TE/TR  =  25.  50. 



Fig. 7. White matter eigenimages generated after preprocessing (MLE + AVG) of the original images with several neighborhood sizes and threshold values.
Neighborhood sizes for images in the first, second, third, and fourth rows are 3 x 3, 5 x 5, 7 x 7, and 9 x 9, respectively. Threshold values for the
first, second, third, and fourth columns are 1σ (σ = 8.75), 2σ, 4σ, and 8σ, respectively.


75, 100/2500 ms and a spin-echo T1-weighted image with
TE/TR = 20/500 ms) of a hard-boiled egg in gelatin. In this
experiment, we have numerically estimated volumes of the egg
white and egg yolk using the eigenimage filtered MR images*
with and without the application of the AVG restoration filter.
Due to the zero-mean property of the additive noise, the
estimated volume, using many pixels in each slice and several
slices through the object, should be close to the true volume
unless the partial volume information is lost by the restoration
filter while attenuating the additive noise. Thus, a comparison
between the estimated volumes using the original images with
those using the restored images may serve as an evaluation
method for preserving the partial volume information. Table IV
compares these estimated volumes with their actual volumes,
which are measured from egg white and egg yolk water
displacements. Closeness of the results, with and without the
application of the restoration filter, indicates the preservation
of the partial volume information.
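
The zero-mean argument above can be illustrated with a small numerical sketch. It assumes, as a simplification of the method in [30] that is not spelled out in this section, that each eigenimage pixel holds the fraction of the voxel occupied by the tissue of interest, so that a volume is the sum of the fractions times the voxel volume. The phantom shape, the noise level, the voxel volume, and the function names are all hypothetical.

```python
import random


def estimate_volume(fraction_image, voxel_volume_cm3):
    """Sum the per-pixel tissue fractions and convert the total to a volume."""
    return voxel_volume_cm3 * sum(sum(row) for row in fraction_image)


def make_phantom(size):
    """Hypothetical slice: pure tissue inside, a linear partial-volume ramp at the border."""
    image = []
    for i in range(size):
        row = []
        for j in range(size):
            depth = min(i, j, size - 1 - i, size - 1 - j)  # distance to the border
            row.append(min(1.0, depth / 5.0))               # fraction between 0 and 1
        image.append(row)
    return image


random.seed(1)
truth = make_phantom(100)
# Additive zero-mean Gaussian noise (assumed standard deviation 0.1 per pixel).
noisy = [[f + random.gauss(0.0, 0.1) for f in row] for row in truth]

voxel = 0.001  # assumed voxel volume in cubic centimeters
true_volume = estimate_volume(truth, voxel)
noisy_volume = estimate_volume(noisy, voxel)
relative_error = abs(noisy_volume - true_volume) / true_volume
```

Because the noise averages out over many pixels, `relative_error` stays small; a restoration filter that flattened the ramp at the border would instead change the sum of the fractions and therefore the volume.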

Table  IV  also  compares  the  estimated  volume  of  the  central 
region  in  the  simulation  using  noiseless,  noisy,  and  restored 
eigenimages.  Closeness  of  the  results  again  indicates  the 
preservation  of  the  partial  volume  information  by  the  AVG 
version  of  the  new  filter. 

*See [30] for a detailed explanation of the volume determination using the
eigenimage filter.


VII. Summary and Conclusion

We presented the development and an application of a
multidimensional restoration filter for MRI scene sequences.
The proposed filter uses both interframe (parametric or temporal)
and intraframe (spatial) information to filter the additive
noise. Its performance depends on two sets of parameters: 1)
data parameters such as noise power, CNR's between tissue
types, and the number of images in the sequence and 2) filter
parameters such as the neighborhood shape and size and the
threshold value. Typical values of the data parameters, which
were extracted from clinical MRI brain studies, were used
to optimize the filter parameters. In particular, the threshold
value was optimized based on the probabilities of detection
and false alarm. Without a priori knowledge of the shapes of
the objects in the scene, the neighborhood shape was chosen
to be square. The computation time was the main factor in
selecting the neighborhood size; it was decided that a 9 x 9
neighborhood was appropriate.
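
The roles of the two filter parameters can be made concrete with a minimal sketch. The precise definition of the AVG filter is given earlier in the paper and is not reproduced here, so the code below is an assumption: for each pixel it averages, image by image, those neighbors in a square window whose pixel vector (one value per image in the sequence) lies within a Euclidean distance `threshold` of the center pixel vector. The function name and arguments are hypothetical.

```python
import math


def avg_restore(seq, half_width, threshold):
    """Thresholded neighborhood average over an image sequence.

    seq is a list of images (lists of rows) of equal size; half_width = 4
    gives a 9 x 9 window. This is an assumed form of the AVG filter.
    """
    n, rows, cols = len(seq), len(seq[0]), len(seq[0][0])
    out = [[[0.0] * cols for _ in range(rows)] for _ in range(n)]
    for i in range(rows):
        for j in range(cols):
            sums = [0.0] * n
            count = 0
            for p in range(max(0, i - half_width), min(rows, i + half_width + 1)):
                for q in range(max(0, j - half_width), min(cols, j + half_width + 1)):
                    # Interframe distance between neighbor and center pixel vectors.
                    d2 = sum((seq[k][p][q] - seq[k][i][j]) ** 2 for k in range(n))
                    if math.sqrt(d2) <= threshold:
                        for k in range(n):
                            sums[k] += seq[k][p][q]
                        count += 1
            # The center pixel always passes the test, so count >= 1.
            for k in range(n):
                out[k][i][j] = sums[k] / count
    return out
```

With a threshold well below the contrast between two tissues, neighbors on the far side of an edge are excluded from the average, which is why a filter of this type can suppress noise without blurring edges. The window size mainly sets the amount of averaging and the computation time.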

Since the new filter is specifically designed for MRI, it
outperforms conventional methods, including median, median-average,
spatial smoothing, and low-pass filtering. Application
of the MLE + AVG version of the new filter as a preprocessing
step before the eigenimage filtering improved the CNR's of
the resulting eigenimages by a factor between 15 and 60
in a reasonable computation time (3 to 4 min on a Sun

Fig. 8. One-dimensional signal extracted from the simulation study by
considering all pixels on the 170th row of the third original image (Fig. 4(c)):
(a) without noise; (b) with zero-mean Gaussian noise; (c) restored using MLE
+ AVG filter; (d) restored using AVG filter. For both filters, a threshold value
of 4σ (σ = 8.75) and a neighborhood size of 9 x 9 were used.


TABLE II

CNR's for the MLE + AVG Preprocessed Eigenimages
of the Simulation and Brain, Using Several
Neighborhood Sizes (NS) and Threshold Values (T)

                       CNR1                    CNR2
NS       T      Simulation    Brain     Simulation    Brain
3 x 3    1σ        3.07        3.81        3.28        3.84
3 x 3    2σ        3.78        5.37        4.07        5.28
3 x 3    4σ       12.72        8.49       14.44        7.82
3 x 3    6σ       14.09        8.23       15.93        7.56
3 x 3    8σ       14.08        7.75       15.91        7.08
5 x 5    1σ        3.11        3.87        3.33        3.91
5 x 5    2σ        5.11        6.40        5.58        5.93
5 x 5    4σ       23.93       11.36       27.80       12.04
5 x 5    6σ       25.91       10.24       30.06       12.02
5 x 5    8σ       25.90        9.05       30.02       10.15
7 x 7    1σ        3.19        3.96        3.42        4.00
7 x 7    2σ        6.24        7.07        6.86        6.74
7 x 7    4σ       37.61       15.22       43.22       16.75
7 x 7    6σ       40.55       13.80       46.10       18.72
7 x 7    8σ       40.61       12.05       46.09       16.35
9 x 9    1σ        3.28        4.11        3.53        4.17
9 x 9    2σ        7.01        7.45        7.77        7.53
9 x 9    4σ       52.64       17.64       58.68       22.60
9 x 9    6σ       55.22       16.84       58.83       27.63
9 x 9    8σ       54.68       15.23       57.68       26.94

SPARCstation 2) while preserving and enhancing edges. The
computation time may significantly be reduced by using a
parallel processor and/or by a Kalman filtering implementation
of the MLE. Details of these implementations were beyond the
scope of this paper. As explained in Section VI-B, our intended
use of the Kalman filter will increase the speed in a tradeoff


TABLE III

CNR's for the Original and Restored Images, and Preprocessed or
Postprocessed Eigenimages of the Simulation and Brain

                                          CNR1                    CNR2
                                   Simulation    Brain     Simulation    Brain
Original images and eigenimages
  1st MSEI                            -9.15       -7.85       -5.15       -4.5?
  2nd MSEI                            -6.88       -7.70      -16.64      -20.39
  3rd MSEI                            -4.54       -5.09      -20.30      -21.64
  4th MSEI                            -3.08       -3.81      -18.52      -24.19
  Eigenimage                           0.94        1.28        0.96        1.56
Restored images using MLE + AVG
  1st MSEI                          -121.88?     -14.79      -72.53      -14.09
  2nd MSEI                          -195.51      -16.71     -435.62      -78.12
  3rd MSEI                          -140.14      -16.94     -407.42     -127.71
  4th MSEI                          -114.24      -16.66     -333.06     -131.79
Eigenimages with preprocessing
  MLE                                  3.05        3.77        3.26        3.81
  AVG                                 17.06       19.37       14.40       21.17
  MLE + AVG                           52.64       17.64       58.68       22.60
  Median                               0.93        1.76        0.88        2.04
  Median-AVG                          10.95       10.35       10.18       12.08
Eigenimages with postprocessing
  Smooth 3-1                           2.92        4.94        3.09        5.14
  Smooth G-9                           7.84       10.99        7.92       14.08
  AVG                                 19.01        8.11       15.16       18.28
  Median                               1.11        2.78        1.01        7.96
  Median-AVG                          10.30        8.61        9.80       24.02

A ? marks a digit that is illegible in the scanned source.

TABLE IV

A Comparison of Estimated and Actual Volumes for Egg White and
Egg Yolk in an Egg Phantom and the Central Region (White Matter)
in the Simulation. All Volumes Are Quoted in Cubic Centimeters

Phantom                    using water      using acquired    using AVG pre-
                           displacement     images            processed images
egg white's volume            34.8             34.3              34.2
egg yolk's volume             21.8             21.7              21.3

Simulation                 using noiseless  using noisy       using AVG pre-
                           images           images            processed images
central region's volume       58.13            57.19             58.68

with  performance  and  is  in  a  totally  different  context  than  that 
conventionally  applied  to  image  restoration. 

Experimental results showed that the AVG version of the
new restoration filter preserved partial volume information.
These results indicate that the new multidimensional filter
is successful in MR image restoration. It can be used as
a preprocessing step to greatly improve the performance
of image combination techniques, pattern recognition and
classification algorithms, and tissue characterization methods.
In a clinical study of human cerebral infarcts, it improved
detectability of anatomical structures and tissue characteristics
[31].

Between the two versions of the filter (MLE + AVG and
AVG) with and without using the MRI signal models, the
MLE + AVG version is most appropriate when processing
images acquired using only one pulse sequence, e.g., a multiple
spin-echo sequence, and thin slices. This is because 1) to our
knowledge, there is no simple signal model that accurately
fits a combination of different pulse sequences, e.g., a multiple
spin-echo and an inversion recovery, and 2) for thick
slices, a simplified signal model, e.g., the one given in (2),
generates partial volume artifacts for overlapping tissues with
significantly different intrinsic tissue parameters, e.g., white
matter and CSF. On the other hand, the AVG version may be
used for any MRI scene sequence acquired using an arbitrary
combination of pulse sequences. An important point, however,
is that if the noise level changes from image to image,
appropriate scales should be considered in the restoration
filter, i.e., a scaled version of the Euclidean distance to be
used, or images should be scaled before the application of the
restoration filter.
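
As a small illustration of the scaling remark, one possible scaled Euclidean distance divides each image's difference by that image's noise standard deviation before squaring. This particular normalization and the names `scaled_distance` and `sigmas` are assumptions made for the sketch; the text only says that appropriate scales should be used.

```python
import math


def scaled_distance(u, v, sigmas):
    """Euclidean distance between two pixel vectors, each component scaled
    by the noise standard deviation of the image it comes from (assumed form)."""
    return math.sqrt(sum(((a - b) / s) ** 2 for a, b, s in zip(u, v, sigmas)))


# Two pixel vectors from a two-image sequence; the second image is twice as noisy.
d = scaled_distance([0.0, 0.0], [3.0, 8.0], [1.0, 2.0])  # sqrt(3**2 + 4**2)
```

Scaling the images themselves by `1 / sigma` before filtering and then using the ordinary Euclidean distance gives the same distances, which matches the second option mentioned in the text.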

Although this paper focused on the development and application
of the restoration filter for MRI, its idea and theoretical
bases are not limited to MRI. In particular, variations of the
proposed filter can be applied to images other than MRI. We
have successfully applied them to computed tomography
(CT), nuclear medicine (NM), and synthetic aperture radar
(SAR) images. A manuscript explaining these applications and
issues associated with them is in preparation.

Acknowledgment 

The  authors  would  like  to  thank  L.  Bower  and  S.  Ramsey 
for  their  programming  skills.  They  are  also  grateful  to  the 
associate  editor  and  anonymous  reviewers  whose  comments 
improved  presentation  of  the  paper. 

References 

[1] J. B. de Castro et al., "MR subtraction angiography with matched filter,"
J. Comput. Assist. Tomogr., vol. 12, no. 2, pp. 355-362, 1988.

[2] J. P. Windham, M. A. Abd-Allah, D. A. Reimann, J. W. Froelich, and A.
M. Haggar, "Eigenimage filtering in MR imaging," J. Comput. Assist.
Tomogr., vol. 12, no. 1, pp. 1-9, 1988.

[3] H. Soltanian-Zadeh, J. P. Windham, and J. M. Jenkins, "Error propagation
in eigenimage filtering," IEEE Trans. Med. Imaging, vol. 9, no.
4, pp. 405-420, 1990.

[4] H. Soltanian-Zadeh, J. P. Windham, D. J. Peck, and A. E. Yagle, "A
comparative analysis of several transformations for enhancement and
segmentation of magnetic resonance image scene sequences," IEEE
Trans. Med. Imaging, vol. 11, no. 3, pp. 302-318, 1992.

[5] E. R. McVeigh, R. M. Henkelman, and M. J. Bronskill, "Noise and
filtration in magnetic resonance imaging," Med. Phys., vol. 12, no. 5,
pp. 586-591, 1985.

[6] F. W. Wehrli, "Signal-to-noise and contrast in MR imaging," in NMR in
Medicine, the Instrumentation and Clinical Applications (S. R. Thomas
and R. L. Dixon, Eds.). Amer. Assoc. Phys. Medicine, Medical
Physics Monograph No. 14, 1986, pp. 216-228.

[7] R. E. Hendrick, "Image contrast and noise," in Magnetic Resonance
Imaging. St. Louis: Stark and Bradley, Mosby-Year Book, 1988, pp.
66-83, 1st ed., ch. 5.

[8] P. Perona and J. Malik, "Scale-space and edge detection using
anisotropic diffusion," IEEE Trans. Patt. Anal. Machine Intell., vol.
12, no. 7, pp. 629-639, 1990.

[9] P. Saint-Marc, J. S. Chen, and G. Medioni, "Adaptive smoothing: A
general tool for early vision," in Proc. IEEE Comput. Soc. Conf. Comput.
Vision Patt. Recogn., 1989, pp. 618-624.

[10] A. C. Bovik, T. S. Huang, and D. C. Munson, "A generalization of
median filtering using linear combination of order statistics," IEEE
Trans. Acoust., Speech, Signal Processing, vol. ASSP-31, no. 12, pp.
1342-1350, 1983.

[11] J. B. Bednar and T. L. Watt, "Alpha trimmed means and their relationship
to median filters," IEEE Trans. Acoust., Speech, Signal Processing,
vol. ASSP-32, no. 2, pp. 145-153, 1984.

[12] Y. H. Lee and S. A. Kassam, "Generalized median filtering and related
nonlinear filtering techniques," IEEE Trans. Acoust., Speech, Signal
Processing, vol. ASSP-33, no. 3, pp. 672-683, 1985.

[13] R. Ding and A. N. Venetsanopoulos, "Generalized homomorphic and
adaptive order statistic filters for the removal of impulsive and signal-dependent
noise," IEEE Trans. Circ. Syst., vol. CAS-34, no. 8, pp.
948-955, 1987.

[14] W. B. McCain and C. D. McGillem, "Performance improvement
of DPLL's in non-Gaussian noise using robust estimators," IEEE
Trans. Acoust., Speech, Signal Processing, vol. ASSP-35, no. 11, pp.
1207-1216, 1987.

[15] A. Restrepo and A. C. Bovik, "Adaptive trimmed mean filters for image
restoration," IEEE Trans. Acoust., Speech, Signal Processing, vol. 36, no.
8, pp. 1326-1337, 1988.

[16] M. Pietikainen and D. Harwood, "Segmentation of color images using
edge-preserving filters," in Advances in Image Processing and Pattern
Recognition (V. Cappellini and R. Marconi, Eds.). Amsterdam:
Elsevier, North Holland, 1986, pp. 94-99.

[17] H. L. Van Trees, Detection, Estimation, and Modulation Theory. New
York: Wiley, 1968.

[18] H. Soltanian-Zadeh, J. P. Windham, and D. O. Hearshen, "Preprocessing
of MR image sequences using a new edge-preserving
multi-dimensional filter," Presented at the 10th Ann. Mtg. Soc. Mag.
Res. Med. (SMRM), San Francisco, CA, Aug. 10-16, 1991 (Abstract
published in the SMRM Book of Abstracts, vol. 2, p. 748, 1991).

[19] J. P. Windham, A. M. Haggar, D. O. Hearshen, J. R. Roebuck, and
D. A. Reimann, "A novel method for volume determination using MR
image sequence," Presented at the 7th Ann. Mtg. Soc. Mag. Res. Med.
(SMRM), San Francisco, CA, Aug. 20-26, 1988 (Abstract published in
the SMRM Book of Abstracts, vol. 2, pp. 1081, 1988).

[20] H. Stark and J. W. Woods, Probability, Random Processes, and Estimation
Theory for Engineers. Englewood Cliffs: Prentice-Hall, 1986.

[21] J. N. Lee and S. J. Riederer, "The contrast-to-noise in relaxation time,
synthetic, and weighted-sum MR images," Mag. Res. Med., vol. 5, pp.
13-22, 1987.

[22] D. G. Brown et al., "CNR enhancement in the presence of multiple
interfering processes using linear filters," Mag. Res. Med., vol. 14, no.
1, pp. 79-96, 1990.

[23] R. B. Buxton, "Target-point combination of MR images," Magn. Res.
Med., vol. 18, no. 1, pp. 102-115, 1991.

[24] S. L. Meyer, Data Analysis for Scientists and Engineers. New York:
Wiley, 1975, pp. 39-48.

[25] M. A. Bernstein, D. M. Thomasson, and W. H. Perman, "Improved
detectability in low SNR magnetic resonance images by means of phase-corrected
real reconstruction," Med. Phys., vol. 16, no. 5, pp. 813-817,
1989.

[26] H. Soltanian-Zadeh, J. P. Windham, and A. E. Yagle, "Optimal transformation
for correcting partial volume averaging effects in magnetic resonance
imaging," IEEE Trans. Nuc. Sci., vol. 40, no. 4, pp. 1204-1212,
1993.

[27] S. M. Kay, Modern Spectral Estimation: Theory and Application.
Englewood Cliffs, NJ: Prentice-Hall, 1988.

[28] J. L. Devore, Probability and Statistics for Engineering and the Sciences.
Monterey, CA: Brooks/Cole, 2nd ed., 1987.

[29] P. Haavisto, P. Heinonen, and Y. Neuvo, "Vector FIR-median hybrid
filters for multispectral signals," Electron. Lett., vol. 24, no. 1, pp. 7-8,
1988.

[30] D. J. Peck, J. P. Windham, H. Soltanian-Zadeh, and J. Roebuck, "A fast
and accurate algorithm for volume determination in MRI," Med. Phys.,
vol. 19, no. 3, pp. 599-605, 1992.

[31] J. P. Windham, H. Soltanian-Zadeh, and D. J. Peck, "Delineation of
internal structure in cerebral tumors using MRI," Presented at the 34th
Ann. Mtg. Amer. Assoc. Phys. Medicine (AAPM), Calgary, Canada,
Aug. 1992 (Abstract published in Med. Phys., vol. 19, no. 3, p. 844,
1992).


Hamid Soltanian-Zadeh (S'89-M'92) was born in
Yazd, Iran, in 1960. He received the B.S. and M.S.
degrees in electrical engineering: electronics from
Tehran University, Tehran, Iran, in 1986 and the
M.S.E. and Ph.D. degrees in electrical engineering:
systems from the University of Michigan, Ann
Arbor, in 1991 and 1992, respectively.

From 1985 to 1986, he was with Iran Telecommunication
Research Center in Tehran. In 1987, he
was a lecturer of electrical engineering at Tehran
University. Since September 1988, he has been with
the Department of Diagnostic Radiology and Medical Imaging of Henry Ford
Hospital, Detroit, MI, where he is currently a Senior Staff Scientist. He
has also been with the Department of Electrical Engineering and Computer
Science of the University of Michigan, where he currently holds a Visiting
Scholar affiliation. His research interests include medical imaging, image
reconstruction and processing, pattern recognition, and neural networks.


SOLTANIAN-ZADEH et al.: MULTIDIMENSIONAL NONLINEAR EDGE-PRESERVING FILTER


Joe P. Windham (M'92) received the Ph.D. degree
in nuclear science and engineering from the University
of Cincinnati, OH, in 1972.

He was at the Medical College of Ohio, Toledo,
from 1972 to 1984, where he was an Associate
Professor and Head of the Division of Medical
Physics of the Department of Radiology. He has
been with the Department of Radiology of Henry
Ford Hospital, Detroit, MI, since June 1984, where
he is currently Head of the Division of Radiological
Physics. He is certified by the American Board of
Radiology in Radiological Physics. His research interests include image
processing and quantitative analysis of medical images, particularly images
obtained from magnetic resonance, computed tomography, and nuclear medicine.

Dr. Windham is active in the American College of Radiology, where he is
a member of the Commission on Physics and Radiation Safety, Commission
on Education, and Commission on Radiation Oncology. He is chairman of
the Committee on Education under the Commission on Physics and Radiation
Safety.


Andrew E. Yagle (M'85) was born in Ann Arbor,
MI, in 1956. He received the B.S.E. and B.S.E.E.
degrees from the University of Michigan, Ann Arbor,
in 1977 and 1978, respectively, and the S.M.,
E.E., and Ph.D. degrees from the Massachusetts
Institute of Technology (MIT), Cambridge, in 1981,
1982, and 1985, respectively.

While at MIT, from 1982 to 1985 he was on
an Exxon Teaching Fellowship. Since September
1985, he has been with the Department of Electrical
Engineering and Computer Science, the University
of Michigan, Ann Arbor, where he is currently an Associate Professor.
His research interests include fast algorithms for digital signal processing,
multiresolution and iterative algorithms in medical imaging, multidimensional
inverse scattering, phase retrieval, and linear least-squares estimation.

Dr. Yagle received the NSF Presidential Young Investigator Award in 1988
and the ONR Young Investigator Award in 1990. He received H.H. Rackham
School of Graduate Studies Research Partnership Awards with T.-S. Pan in
1990 and with K.R. Raghavan in 1993. He has received several teaching
awards, including the College of Engineering Teaching Excellence Award in
1992, the Eta Kappa Nu Professor of the Year Award in 1990, and the Class
of 1938E Distinguished Service Award in 1989. He is currently an Associate
Editor of the IEEE SIGNAL PROCESSING LETTERS, IEEE TRANSACTIONS
ON IMAGE PROCESSING, and Multidimensional Systems and Signal
Processing. He is also a member of the Digital Signal Processing Technical
Committee, a past Associate Editor of the IEEE TRANSACTIONS ON
SIGNAL PROCESSING, and co-technical chair of ICASSP-95, which will
be held in Detroit, MI.


APPENDIX  T 

P. Raadhakrishnan, J. Dorband, and A.E. Yagle, "An Algorithm for Forward
and Inverse Scattering in the Time Domain," submitted to J. Acoust.
Soc. Am., September 1993.

As  noted  above,  the  invariant  imbedding  algorithm  used  to  generate  the  forward 
problem  data  is  computationally  intensive.  This  is  because  invariant  imbedding,  although 
similar  to  layer  stripping  in  approach,  is  quite  different,  in  that  it  does  not  take  advantage 
of  time  causality.  Hence  it  solves  many  more  problems  than  are  actually  needed.  Layer 
stripping  avoids  this  and  is  more  efficient. 

This  paper  is  a  first  attempt  at  parallelizing  invariant-imbedding-based  algorithms  for 
both  the  forward  and  the  inverse  problems,  and  thus  reducing  the  computation  required. 
Only  the  1-D  problem  is  considered  here.  The  new  algorithm  is  parallelizable,  and  hence 
requires  less  computation,  and  gives  results  identical  to  the  invariant  imbedding  algorithm 
of Corones et al. A simple error analysis of the effects of computational noise on the
algorithm is also supplied.

This paper presents an alternative numerical scheme for a class of forward and inverse scattering
problems based on the invariant imbedding method [1]. The algorithm presented in this paper is
decoupled more than the previous implementations and hence is readily amenable to parallel processing. It
is also shown that the effect of this decoupling on error can be expressed in terms of the estimated
reflection coefficients, facilitating the decision to accept or reject any computed parameter.


1  Introduction 


When a wave propagates through a medium with varying material properties, the velocity of the
wave changes. The process of such wave propagation is called wave scattering. Many interesting
scattering problems arise in the real world in physical processes such as speech processing,
geophysics, acoustics, transmission line modeling, etc. The study of the scattering process is divided
into two main categories: forward and inverse scattering. The forward scattering problem deals
with the analysis of the wave propagation within a medium. The inverse problem deals with the
reconstruction of the medium properties from the knowledge of a time sequence of reflections from
the medium.

There are two major approaches for solving scattering problems. One approach is based on
integral methods such as the Marchenko formulation, and the other is based on differential layer
stripping methods. Differential methods also lead to many interesting physical interpretations.
In [8], a host of differential methods for solving inverse scattering were discussed and a unified
framework was established for many algorithms. As noted in [8], for computational reasons, the
most popular algorithms used to solve scattering problems are dynamic deconvolution and differential
methods in time or frequency domains. The differential methods implicitly assume that the
medium is inhomogeneous but sufficiently smooth. The smoothness assumption enables one to view
the medium as a physical system consisting of multiple layers, across which the medium properties
vary in a smooth but inhomogeneous manner. By subdividing the smooth inhomogeneous medium
into differential layers in which the medium properties remain constant, the necessary scattering
equations are derived. Solutions to two adjacent layers are connected by the common boundary
condition. In a differential setup, the inverse problem deals with the material reconstruction in a
thin differential layer, whereas the forward problem deals with the construction of the reflection
kernel differentially. By propagating the solution across the layers, the whole system is solved for
the forward or the inverse problems.

The time domain forward and inverse scattering problems allow one to reconstruct the inhomogeneous
medium locally [1] - [3]. In [1], an analytical method based on invariant imbedding
([3], [7]) was proposed to solve for the forward and inverse scattering problems in time domain. A
non-linear differential integral equation was derived in terms of the reflection kernel of the inhomogeneous
medium using invariant imbedding techniques. Because the reflection kernel is connected
to the medium properties, computing the reflection kernel in a local region enables one to compute
the material properties in that region. Hence, the same equation was used for the forward and the
inverse scattering problems. In frequency domain problems, computation of the reflection kernel at
any depth involves the knowledge of the reflection kernel at different depths, as can be seen from
the derivation presented below.

This paper deals with a reformulation of a class of 1-D forward and inverse scattering problems,
employing an invariant imbedding method. The paper is divided into five main sections. Section
2 presents the formulation of the problem, the necessary propagation equation, and the available
computational algorithms [1] and their redundancies. Section 3 presents an efficient reformulation
of the existing computational algorithms to derive the modified algorithms. A quick derivation of
the error bound for the modified forward problem is presented in section 4. Section 5 deals with the
issues in numerical simulations of the modified forward and the inverse problems. For illustration
purposes, the timings for the forward and the inverse problems are also presented.


2  Problem  Description 


In  this  section  we  describe  the  problem  presented  in  [1]  -  [3].  (Refer  to  [1]  -  [2]  for  the  details  of  the 
problem).  An  alternative  brief  derivation  of  the  differential  integral  equations  is  presented  below. 

To derive the mathematical relationship between the reflection kernel and the material properties
of the inhomogeneous medium, the following assumptions were made in [1]:


1.  The  material  properties  of  the  inhomogeneous  region  of  the  medium  are  sufficiently  smooth 
and  vary  as  a  function  of  the  depth  z  of  the  medium. 

2. The medium is inhomogeneous in the interval $a_0 < z < b_0$ and homogeneous elsewhere.

3. The material properties for $z < a_0$ and $z > b_0$ match those at $z = a_0$ and $z = b_0$, respectively.
Hence, the incident and the reflected wavefields can be easily identified.


Fig. 1 shows the velocity profile $c(z)$ of the wavefield. The velocity $c(z)$ in the inhomogeneous
region of the medium ($a_0 < z < b_0$) is continuous at the boundaries $a_0$ and $b_0$.


The basic idea behind invariant imbedding methods is to subdivide the interval $[a_0, b_0]$ into n
smaller intervals with varying widths, and solve the scattering equation for a thin layer with known
boundary conditions. Using this solution, the scattering problem for the next layer can be solved
and the solution can then be propagated to the other end. In doing so, one implicitly assumes that
outside the boundaries of each thin layer, the medium is homogeneous, i.e., two layers interact only
by the common boundary. Hence, the resulting equations are functions of the varying boundary
coordinates. We derive the necessary non-linear differential integral equations presented in [1] based
on the two-component model of the wave propagation [8].


2.1 Propagation Equation


The  propagation  equation  is  derived  using  the  two-component  frequency  domain  model  in  terms  of 
the  travel-time  coordinate  defined  by  ([1],  [5],  [8]): 


$$x = \Phi(z) = \int_{a_0}^{z} \frac{ds}{c(s)} \tag{1}$$


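and

$$\frac{d}{dx}\begin{bmatrix} D \\ U \end{bmatrix} = \begin{bmatrix} -i\omega & r(x) \\ r(x) & i\omega \end{bmatrix} \begin{bmatrix} D \\ U \end{bmatrix} \tag{2}$$

where $D = D(x,\omega)$ and $U = U(x,\omega)$ are the wave components along the positive and the
negative x directions, respectively. Letting $R(x,\omega) = U(x,\omega)/D(x,\omega)$ and differentiating $R(x,\omega)$
with respect to x leads to the following equation:

$$\frac{d}{dx}R(x,\omega) = 2i\omega R(x,\omega) + r(x) - r(x)R^2(x,\omega). \tag{3}$$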


As  mentioned  in  the  introduction,  the  frequency  domain  equations  involve  the  knowledge  of  the 
reflection  kernel  at  all  depths  and  hence  local  reconstruction  is  not  favored.  In  order  to  solve  the 
local  properties,  the  time  domain  equation  is  derived  by  taking  an  inverse  Fourier  transform  of 
equation  (3).  The  associated  boundary  condition  can  be  derived  using  the  final  value  theorem. 
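
The step from the two-component system (2) to the Riccati-type equation (3) can be checked numerically. The sketch below integrates both with a classical fourth-order Runge-Kutta scheme and compares U/D with R at the end of the interval. The reflectivity profile `r(x)`, the frequency, the initial values, and all function names are hypothetical choices made only for this check.

```python
import math


def r(x):
    """Hypothetical smooth reflectivity profile."""
    return 0.5 * math.sin(2.0 * x)


def rk4(f, y, x0, x1, steps):
    """Classical Runge-Kutta integration of y' = f(x, y) for a list-valued y."""
    h = (x1 - x0) / steps
    x = x0
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, [a + h / 2 * b for a, b in zip(y, k1)])
        k3 = f(x + h / 2, [a + h / 2 * b for a, b in zip(y, k2)])
        k4 = f(x + h, [a + h * b for a, b in zip(y, k3)])
        y = [a + h / 6 * (b + 2 * c + 2 * d + e)
             for a, b, c, d, e in zip(y, k1, k2, k3, k4)]
        x += h
    return y


def two_component(omega):
    # Right-hand side of the system for [D, U].
    return lambda x, y: [-1j * omega * y[0] + r(x) * y[1],
                         r(x) * y[0] + 1j * omega * y[1]]


def riccati(omega):
    # Right-hand side of the equation for R = U / D.
    return lambda x, y: [2j * omega * y[0] + r(x) - r(x) * y[0] ** 2]


omega = 1.5
D0, U0 = 1.0 + 0j, 0.2 + 0.1j
D, U = rk4(two_component(omega), [D0, U0], 0.0, 1.0, 2000)
(R,) = rk4(riccati(omega), [U0 / D0], 0.0, 1.0, 2000)
mismatch = abs(U / D - R)
```

The two results agree to within the integration error, which confirms the signs of the three terms on the right-hand side of (3).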


$$\frac{\partial}{\partial x}R = 2\frac{\partial}{\partial t}R - r(x)\bigl(R(x,t-x) \otimes R(x,t-x)\bigr) + r(x)\delta(t) \tag{4a}$$

where $\otimes$ denotes the linear convolution in time variable.

$$R(x,y,0^+) = -\tfrac{1}{4}A(x), \qquad R(y,y,t) = 0; \; t > 0. \tag{4b}$$


where $A(x) = 2r(x)$. Equation (4a) is essentially the Riccati equation derived in [1]. Travel time
coordinate transformation can also be interpreted as the velocity normalization. Hence, equation
(4a) can be further reduced to the following form:

$$\frac{\partial}{\partial x}R = -r(x)\bigl(R(x,t-x) \otimes R(x,t-x)\bigr) + r(x)\delta(t) \tag{5}$$

In the forward problem, we are given $A(x)$ for $0 < x < \Phi(b_0)$ and using equation (4a) we can
solve for the reflection kernel of the medium, $R(0,\Phi(b_0),t) = R^+(a_0,b_0,t)$. In the inverse problem,
we are given a finite initial section of the reflection kernel $R^+(a_0,b_0,t)$ for $0 < t < 2\Phi(b_0)$. ($2\Phi(b_0)$
is the round trip travel time for a pulse to go to the depth $b_0$ and return to the surface.) From the
knowledge of this data, $A(x)$ is reconstructed for $0 < x < \Phi(b_0)$ in a differential manner. Starting
with the knowledge of $R(x,y,t)$ at depth x, $A(x)$ is computed. Next, $R(x+\delta x,y,t)$ is computed by
advancing one layer into the medium using equation (4a) and the knowledge of $A(x)$ at depth x.
The solution $R(x+\delta x,y,t)$ is then used to compute $A(x+\delta x)$. This recursive procedure is essentially
the dynamic deconvolution process.


2.2 Computational Algorithms for the Riccati Equation


Assuming that the travel-time coordinates are normalized to unity, we set $y = 1$ and write $R(x,y,t)$ as $R(x,t)$. This is equivalent to fixing the left boundary of the problem. Furthermore, we set the computational grid for the problem to be $0 < x < 1$ and $0 < t < 2(1-x)$. In this setup, for the forward problem, $A(x)$ is known and $R(0,t)$ has to be computed for $0 < t < 2$. For the inverse problem, $R(0,t)$ is specified for $0 < t < 2$ and $A(x)$ has to be obtained for $0 < x < 1$.

Using the trapezoidal rule and integrating over the interval $(x_0, x_0+\epsilon)$, equation (5) can be rewritten as:

$$R(x_0+\epsilon,\,t-2(x_0+\epsilon)) - R(x_0,\,t-2x_0) = -\frac{\epsilon}{4}\Big[A(x_0+\epsilon)\,R(x_0+\epsilon,\,t-2(x_0+\epsilon))\otimes R(x_0+\epsilon,\,t-2(x_0+\epsilon)) + A(x_0)\,R(x_0,\,t-2x_0)\otimes R(x_0,\,t-2x_0)\Big]. \qquad (6)$$

The next two subsections present the numerical schemes for the forward and the inverse scattering problems, respectively.


2.2.1  Numerical  Algorithm  for  the  Forward  Problem 


For the forward problem, the discrete boundary condition $R_{i,0} = -A_i/4$ is given, and we solve equation (6) for the reflection kernel by moving from right to left and bottom to top along the computational grid given in Fig. 2.

The  numerical  value  of  the  reflection  kernel  at  the  circled  point  in  Fig.  2  is  computed  from  the 
values  of  the  reflection  kernel  at  the  grid  points  marked  x.  The  corresponding  algorithm  is  given 
by: 


• For $i = n$ to $0$ Begin

• $R_{i,0} = -A_i/4$

• For $j = 0$ to $n - i$ Begin

• $R_{i,j} = \left(1 + \epsilon^{2}A_i^{2}/8\right)^{-1}\Big[\,\cdots + \sum_{k} R_{i,k}R_{i,j-k}\Big]$

• End

• End.
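A minimal Python sketch of one way to turn equation (6) into a sequential forward solver. It is not a transcription of the listing above: it assumes a depth step $\epsilon = 1/n$, a time step $2\epsilon$ (so that $R_{i+1,j}$ and $R_{i,j+1}$ lie on the same characteristic), and trapezoidal quadrature for the convolution. The names `trap_conv` and `forward_kernel` are hypothetical. Under these assumptions the end-point term of the convolution at level $i$ contains the unknown $R_{i,j+1}$ itself, which is where the factor $(1 + \epsilon^{2}A_i^{2}/8)^{-1}$ comes from.

```python
import math


def trap_conv(row, j):
    """Trapezoidal sum for (R conv R)(t_j), without the time-step factor 2*eps."""
    if j == 0:
        return 0.0  # zero-length integration interval
    inner = sum(row[k] * row[j - k] for k in range(1, j))
    # The two half-weighted end points are equal, so they add up to row[0]*row[j].
    return row[0] * row[j] + inner


def forward_kernel(A):
    """Return rows R[i][j] ~ R(x_i, t_j) with x_i = i*eps, t_j = 2*j*eps.

    A is the list A[0..n] of medium samples; row i holds j = 0..n-i.
    Assumed discretization of equation (6), not the report's exact scheme.
    """
    n = len(A) - 1
    eps = 1.0 / n
    c = 0.5 * eps * eps
    R = [None] * (n + 1)
    R[n] = [-A[n] / 4.0]                      # boundary value R_{n,0}
    for i in range(n - 1, -1, -1):            # right to left in depth
        row = [-A[i] / 4.0] + [0.0] * (n - i)
        below = R[i + 1]
        for j in range(n - i):                # bottom to top in time
            # Interior products at the same depth: this is the coupling
            # between time samples of one layer.
            same = sum(row[k] * row[j + 1 - k] for k in range(1, j + 1))
            rhs = below[j] + c * (A[i + 1] * trap_conv(below, j) + A[i] * same)
            row[j + 1] = rhs / (1.0 + eps * eps * A[i] ** 2 / 8.0)
        R[i] = row
    return R


# Usage: surface response R(0, t_j) for A(x) = sin(4*pi*x) + 2 on 64 layers.
n_layers = 64
profile = [math.sin(4.0 * math.pi * i / n_layers) + 2.0 for i in range(n_layers + 1)]
surface = forward_kernel(profile)[0]
```

The inner sum over `row` shows why the conventional scheme is sequential in time at a fixed depth: each new sample needs the earlier samples of the same row.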


2.2.2 Numerical Algorithm for the Inverse Problem


For the inverse problem, the discrete boundary condition $A_i = -4R_{i,0}$ is given. Equation (6) is then solved for the reflection kernel and $A_i$ simultaneously by moving from left to right and top to bottom along the computational grid shown in Fig. 3.

The  numerical  value  of  the  reflection  kernel  at  the  circled  point  in  Fig.  3  is  computed  from  the 
values  of  the  reflection  kernel  at  the  grid  points  marked  x.  The  corresponding  algorithm  is  given 
by: 


• $A_0 = -4R_{0,0}$

• For $i = 0$ to $n - 1$ Begin

• $R_{i+1,0} = R_{i,1} - \cdots\,A_i R_{i,0} R_{i,1}$

• $A_{i+1} = -4R_{i+1,0}$

• For $j = 0$ to $n - i$ Begin

• $R_{i+1,j} = \left(1 - \epsilon^{2}A_{i+1}^{2}/8\right)^{-1}\Big[\,\cdots\Big(A_{i+1}\sum_{k=1} R_{i+1,k}R_{i+1,j-k} + A_i\sum_{k} R_{i,k}R_{i,j+1-k}\Big)\Big]$

• End

• End.
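A minimal Python sketch of the layer-stripping recursion, obtained by solving a trapezoidal discretization of equation (6) for the deeper level instead of the shallower one. As assumptions not fixed by the text: the depth step is $\epsilon = 1/n$, the time step is $2\epsilon$, the surface record is sampled as `surface[j]` $\approx R(0, 2j\epsilon)$ for $j = 0,\dots,n$, and the function name `invert_kernel` is hypothetical. With these choices the first sample of each new layer is $R_{i+1,0} = R_{i,1} - \tfrac{\epsilon^{2}}{2}A_iR_{i,0}R_{i,1}$ and the remaining samples carry the factor $(1 - \epsilon^{2}A_{i+1}^{2}/8)^{-1}$.

```python
def invert_kernel(surface):
    """Recover A[0..n] from surface[j] ~ R(0, 2*j*eps), j = 0..n, eps = 1/n.

    Assumed discretization of equation (6) solved for the deeper layer;
    a sketch, not the report's exact listing.
    """
    n = len(surface) - 1
    eps = 1.0 / n
    c = 0.5 * eps * eps
    row = list(surface)                 # R[i][0..n-i], starting with i = 0
    A = [-4.0 * row[0]]                 # A_0 = -4 R_{0,0}
    for i in range(n):                  # advance one layer at a time
        m = len(row) - 1
        nxt = [0.0] * m                 # R[i+1][0..n-i-1]
        nxt[0] = row[1] - c * A[i] * row[0] * row[1]
        a_next = -4.0 * nxt[0]          # A_{i+1} = -4 R_{i+1,0}
        denom = 1.0 - eps * eps * a_next ** 2 / 8.0
        for j in range(1, m):
            # Full trapezoidal sum at the known (shallower) level.
            above = row[0] * row[j + 1] + sum(
                row[k] * row[j + 1 - k] for k in range(1, j + 1)
            )
            # Interior products at the new level; the end-point term
            # has been moved into `denom`.
            inner = sum(nxt[k] * nxt[j - k] for k in range(1, j))
            nxt[j] = (row[j + 1] - c * (a_next * inner + A[i] * above)) / denom
        A.append(a_next)
        row = nxt
    return A


# Usage: a two-sample record gives a two-layer profile.
recovered = invert_kernel([-0.5, -0.2])
```

Each pass shortens the record by one sample, which mirrors the triangular computational grid: data for $0 < t < 2$ at the surface determine $A(x)$ only down to $x = 1$.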


From equation (6) and the computational grids for the forward and the inverse problems, we note that the reflection kernel at depths $x_0$ and $x_0+\epsilon$ (or $x_0-\epsilon$) appears on both sides. Hence, the computation of the reflection kernel at a given depth $x$ requires some information about the reflection kernel at the same depth. Because of this coupling, it is preferable to modify equation (6) so that the computation of the reflection kernel at any given depth does not require knowledge of any other point of the reflection kernel at the same depth. Such a modification can improve performance on a parallel computer and reduce error propagation due to the coupling of terms. In the next section we show that such a modification is indeed possible and straightforward.


3 Reformulation of the Computational Algorithms for the Riccati Equation


Since the medium is assumed to be sufficiently smooth, the reflection kernel is a smooth function and, hence, a continuous function of depth $x$. For example, the reflection kernel $R\big(x_0+\tfrac{\epsilon}{2},\,t-2(x_0+\tfrac{\epsilon}{2})\big)$ is left and right continuous. This leads to the following constraint:


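$$\lim_{\epsilon\to 0}\frac{1}{\epsilon}\Big\{R\big(x_0+\tfrac{\epsilon}{2},\,t-2(x_0+\tfrac{\epsilon}{2})\big) - R(x_0,\,t-2x_0)\Big\} = \lim_{\epsilon\to 0}\frac{1}{\epsilon}\Big\{R\big(x_0+\epsilon,\,t-2(x_0+\epsilon)\big) - R\big(x_0+\tfrac{\epsilon}{2},\,t-2(x_0+\tfrac{\epsilon}{2})\big)\Big\}. \qquad (7)$$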


In the computational algorithm described in the previous subsection, $\epsilon$ denotes the discrete step size. Since the reflection kernel is a smooth function of depth, instead of integrating equation (4a) over the interval $(x_0, x_0+\epsilon)$, we can integrate it over the intervals $(x_0, x_0+\tfrac{\epsilon}{2})$ and $(x_0+\tfrac{\epsilon}{2}, x_0+\epsilon)$. Using the results from equation (7), the following forward equation can be derived:


$$R_{i-1,j+1} = R_{i,j} + \cdots\Big[\cdots\sum_{k=1} R_{i+1,k}R_{i+1,j-1-k} + \cdots\sum_{k=1} R_{i,k}R_{i,j-k}\Big]. \qquad (8)$$


Clearly, the reflection kernel has to be computed for the $(i-1)$st layer, and the index $j$ is bounded above by $(n-i-1)$ instead of the conventional bound $n-i$.

Similarly,  using  equation  (7),  the  discrete  version  of  the  inverse  problem  can  be  derived  as: 


$$\cdots\sum_{k=1}^{j+1}\cdots + \cdots\sum_{k=1}\cdots \qquad (9)$$

We note that, unlike the algorithm in [1], the summation limits in equations (8) and (9) do vary at any layer level. Also, the modified equations allow one to compute the reflection kernel at any depth independently of the reflection kernel at that depth at a previous time. This decoupling enables one to compute the different time records of the reflection kernel at any fixed depth in a parallel manner. Moreover, any error propagation due to the computation of the different time records of the reflection kernels will not affect the reflection kernel at the same depth. Hence, the modified algorithm has slightly better control over the error propagation. Moreover, as detailed in the next two subsections, initialization of the modified computational algorithm also differs from the algorithm in [1].


3.1  Modified  Numerical  Algorithm  for  the  Forward  Problem 

For the forward problem, the discrete boundary conditions $R_{i,0} = -A_i/4$ and $R_{n-1,1} = 2A_n/(8+\epsilon^{2}A_n^{2})$ are used to initialize the algorithm. Equation (6) is then solved for the reflection kernel $R_{i,j}$ by moving from right to left and bottom to top along the computational grid shown in Fig. 4.

The numerical value of the reflection kernel at the circled point in Fig. 4 is computed from the values of the reflection kernel at the grid points marked x. It is important to note that the computation of the reflection kernel at the circled point does not require the knowledge of any other reflection kernel at the same layer. Due to the modification done on the forward equation, the computational algorithm needs an extra initial point $R_{n-1,1}$ as follows:


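• $R_{n,0} = -A_n/4$;

• $R_{n-1,1} = 2A_n/(8+\epsilon^{2}A_n^{2})$

• For $i = n - 1$ to $0$ Begin

• $R_{i,0} = -A_i/4$

• For $j = 0$ to $n - i$ Begin

• $R_{i-1,j+1} = R_{i,j} + \cdots\Big[\cdots\sum_{k} R_{i+1,k}R_{i+1,j-1-k} + \cdots\sum_{k} R_{i,k}R_{i,j-k}\Big]$

• End

• End.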

3.2  Modified  Numerical  Algorithm  for  the  Inverse  Problem 


In the case of the inverse problem, the reflection kernel on the surface $z = 0$ is available. For $z < a_0$, the medium is homogeneous, and to initialize the algorithm we add another layer in the homogeneous medium. Due to homogeneity, the reflection kernel of the added layer will be identical to the time records of the reflection kernel of the surface. The reflection kernel at the circled point in Fig. 5 is computed from top to bottom and from left to right along the computational grid as shown below:


• $A_0 = -4R_{0,0}$

• For $i = 0$ to $n - 1$ Begin

• $R_{i+1,0} = R_{i,1} - \cdots$

• $A_{i+1} = -4R_{i+1,0}$

• For $j = 0$ to $n - i$ Begin

• $\cdots - \cdots\Big[A(i)\sum_{k=1} R_{i,k}R_{i,j-k} + A(i-1)\sum_{k} R_{i-1,k}R_{i-1,j-k}\Big]$

• End

• End.


4  Error  Propagation  for  the  Forward  Problem 


The modified equations are not only decoupled, but also have the same order of accuracy as the existing algorithm. The error propagation can be tracked to the same order of accuracy. If, at layer number $i_0$ and location number $j_0$, an error of size $\Delta$ is made in the forward algorithm, it can be shown that the error generated in computing $R_{i,j}$ is given by:


$$R_{i,j} = \Delta\,\Big[\,\cdots\,\Big], \qquad (10)$$


after ignoring the third and higher order terms in $\epsilon$. If $A(x)$ and $R_{i,j}$ are bounded above by $M$ and $p$ respectively, the upper bound on the error at point $R_{i,j}$ due to a computational error of strength $\Delta$ at $R_{i_0,j_0}$ is given by:


$$R_{i,j} \le \Delta\left[1 + \frac{\epsilon^{2}}{2}\,M p\,(i - i_0) + \cdots\right], \qquad (11)$$


where  R  denotes  the  error  contribution.  A  similar  bound  can  be  derived  for  the  inverse  algorithm. 

5 Numerical Examples


In  this  section  we  describe  the  numerical  simulations  performed  using  the  modification  presented 
in  this  paper. 

Simulations  were  aimed  at  testing  the  following  issues: 

•  Check  the  validity  of  the  reformulation  for  the  sequential  version  given  a  fixed  discretization; 

•  Implement  a  completely  parallel  algorithm  for  the  forward  and  inverse  problems; 

•  Compare  the  computational  timings  for  variable  layer  numbers  for  the  sequential  and  parallel 
algorithms. 

The next section details the initializations made in the computation.


5.1  Initializations 


Due to the modification made in the forward algorithm, initialization of the computation requires information at three initial points: $R_{n,0}$, $R_{n-1,0}$ and $R_{n-1,1}$. Among these three points, the first two are available since $R_{i,0} = -A_i/4$. The third point is given by the expression $R_{n-1,1} = 2A_n/(8+\epsilon^{2}A_n^{2})$, where $\epsilon = 1/n$ is the discrete step size and $n$ is the number of layers. With these initial values, the computation is well suited to parallel implementation.

For the inverse problem, the left boundary is the homogeneous medium. Since the modification needs the knowledge of two layers to initiate the computation, we add an extra layer from the homogeneous medium. As seen from Fig. 2, since the homogeneous medium properties match those of the inhomogeneous medium at the common boundary, all we need to do is replicate the leftmost layer. This initializes the inverse algorithm. The modified inverse algorithm is also well suited for parallel implementation.

The governing equations for the forward and the inverse problems are the same except for the initializations. An extra layer in the homogeneous region is used to initialize the inverse problem. However, the additional layer is identical to the boundary layer between the homogeneous and the inhomogeneous medium. This result follows from the assumptions made in section 2. In fact, a successful reconstruction using the results from the forward algorithm as the input to the inverse algorithm is one of the strong indicators that the modifications and the initializations made in this paper are valid. Another way to verify the correctness of the modified equations is to solve the forward problem using the algorithm presented in section 2.2.1, and use the computed reflection kernel to reconstruct the medium using the modified inverse algorithm. These two cases were implemented as follows:


5.1.1 Case 1: Computations using modified forward and inverse algorithms

• $A(x) = \sin(4\pi x) + 2$

• Solve the forward problem using the modified forward algorithm.

• Use the knowledge of the computed reflection kernel and the modified inverse algorithm to reconstruct $A(x)$.

The input and the reconstructed $A(x)$ are shown in Fig. 6.

5.1.2 Case 2: Computations using conventional forward and modified inverse algorithms

• $A(x) = \sin(4\pi x) + 2$

• Solve the forward problem using the conventional forward algorithm.

• Use the knowledge of the computed reflection kernel and the modified inverse algorithm to reconstruct $A(x)$.


The input and the reconstructed $A(x)$ are shown in Fig. 7.

In the second example, $A(x)$ is a piecewise continuous function with

$$A(x) = \begin{cases} 5, & 0 < x < 10 \\ -5, & 10 < x < 16 \\ \sin(4\pi x) + 5, & 16 < x < 50 \\ -5, & 50 < x < 64 \end{cases}$$
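The two test media can be written down as plain functions. This is a small Python sketch with hypothetical names (`smooth_profile`, `piecewise_profile`); the treatment of the break points is an assumption, since the text gives only open intervals, and the argument of the piecewise profile is taken in the same units as the break points 10, 16, 50 and 64.

```python
import math


def smooth_profile(x):
    """A(x) = sin(4*pi*x) + 2, the medium used in Case 1 and Case 2."""
    return math.sin(4.0 * math.pi * x) + 2.0


def piecewise_profile(x):
    """Second example: piecewise medium with jumps at 10, 16 and 50.

    Assumption: each break point is assigned to the segment on its right.
    """
    if x < 10:
        return 5.0
    if x < 16:
        return -5.0
    if x < 50:
        return math.sin(4.0 * math.pi * x) + 5.0
    return -5.0


# Usage: sample the smooth medium on n + 1 grid points of the unit interval.
n_layers = 64
samples = [smooth_profile(i / n_layers) for i in range(n_layers + 1)]
```

The jumps in the second profile are a deliberate stress test, since the reformulation relies on smoothness of the reflection kernel in depth.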


The forward and the inverse problems were solved using the modified algorithms in parallel implementation. The resulting reconstruction of $A(x)$ is plotted in Fig. 8.

Computational timings of the conventional forward and the modified parallel forward algorithms were measured for various discretization step sizes, and their ratio in log scale is plotted in Fig. 9. (The timing ratio is defined as log [conventional timing/modified timing].) Using a
least  square  fit  it  was  noted  that  the  parallel  version  of  the  algorithm  takes  Q{N^)  computations 
as expected. The parallelized and the sequential versions of the algorithms were implemented on a MasPar using the MPL C language.
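The timing analysis can be reproduced in outline as follows. This Python sketch uses hypothetical helper names (`timing_ratio`, `loglog_slope`), assumes the natural logarithm for the ratio (the base is not stated in the text), and fits the growth exponent by ordinary least squares on log-log data. The timings in the usage lines are synthetic values generated from an exact power law purely to exercise the fit; they are not measurements from the MasPar runs.

```python
import math


def timing_ratio(t_conventional, t_modified):
    """log[conventional timing / modified timing]; natural log assumed."""
    return math.log(t_conventional / t_modified)


def loglog_slope(sizes, timings):
    """Least-squares slope of log(timing) against log(size).

    For timings that grow like c * N**p the slope estimates p.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in timings]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Usage with SYNTHETIC timings that follow an exact quadratic law.
sizes = [16, 32, 64, 128]
synthetic = [3e-6 * n ** 2 for n in sizes]
exponent = loglog_slope(sizes, synthetic)
```

With real measurements the fitted slope would only approximate the growth exponent, and the smallest problem sizes are often best excluded because fixed overheads dominate there.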


6  Conclusion 


In  this  paper,  we  presented  a  reformulation  of  a  class  of  1-D  forward  and  inverse  scattering  problems, 
employing  invariant  imbedding  methods.  The  new  formulation  decouples  the  reflection  kernel  in  a 
given  layer. 

It  was  shown  that  the  smoothness  assumptions  made  in  the  formulation  of  the  original  problems 
were  sufficient  for  the  reformulation.  Moreover,  these  assumptions  also  led  to  efficient  initialization 
schemes  for  the  parallel  implementation  of  the  inverse  problem. 

The error in the reformulation was shown to be controlled by the step size (it varies as the square of the step size). Owing to its decoupled nature, the new formulation completely eliminated any error propagation between any two points in the same layer. The error propagation was tractable for both the forward and the inverse problems.

Numerical  results  showed  the  performance  of  the  modified  algorithm  was  indeed  as  expected 
from  the  analytical  derivation  presented  in  the  paper. 

Figure 3: Computational grid for the inverse problem.

Figure 4: Modified computational grid for the forward problem.


Figure  5:  Modified  computational  grid  for  the  inverse  problem. 


Figure  6:  Computations  using  modified  forward  and  inverse  algorithms. 

Figure 9: Computational timing ratio of the conventional forward algorithm and the modified parallel forward algorithm in log scale as a function of the discretization points.